Dear Sir:

This Appeal Brief, filed in connection with the above captioned patent application, is 
responsive to the Final Office Action mailed on August 11, 2005. A Notice of Appeal was filed 
herein on October 28, 2005. This Appeal Brief is being timely filed requesting a one-month 
Extension of Time with the required fees. Appellants hereby appeal to the Board of Patent 
Appeals and Interferences from the final rejection in this case. 

In addition, Appellants request the PTO to take note of the Revocation and Power of
Attorney and Change of Address filed on February 20, 2003, and kindly direct all future
correspondence to the address indicated, i.e., to:

CUSTOMER NO. 35489
Ginger R. Dreger
HELLER EHRMAN LLP 
275 Middlefield Road 
Menlo Park, California 94025 
Telephone: (650) 324-7000 
Facsimile: (650) 324-0638

The following constitutes the Appellants' Brief on Appeal. 




I. REAL PARTY IN INTEREST 

The real party in interest is Genentech, Inc., South San Francisco, California, by an 
assignment of the parent application, U.S. Patent Application Serial No. 09/941,992 recorded 
November 16, 2001, at Reel 012176 and Frame 0450. 

II. RELATED APPEALS AND INTERFERENCES 

The claims pending in the current application are directed to a polypeptide referred to
herein as "PRO1281." There exist two related patent applications: (1) U.S. Patent Application
Serial No. 09/989,726, filed November 19, 2001 (containing claims directed to nucleic acids
encoding PRO1281 polypeptides), and (2) U.S. Patent Application Serial No. 09/993,604, filed
November 14, 2001 (containing claims directed to PRO1281 polypeptides). U.S. Patent
Application Serial No. 09/989,726 (nucleic acid case) has been allowed and the issue fee has
been paid. The related U.S. Patent Application Serial No. 09/993,604 is also under
final rejection by the same Examiner, based upon the same outstanding rejections, and is being
appealed independently and concurrently herewith.

III. STATUS OF CLAIMS 

Claims 119-121 and 123 are in this application. 
Claims 1-118, 122 and 124 have been canceled. 

Claims 119-121 and 123 stand rejected and Appellants appeal the rejection of these
claims.

IV. STATUS OF AMENDMENTS 

All previous amendments have been entered. A copy of the rejected claims in the present 
Appeal is provided as Appendix A. 

V. SUMMARY OF CLAIMED SUBJECT MATTER 

The invention claimed in the present application is related to isolated antibodies and
antibody fragments that specifically bind to the polypeptide of SEQ ID NO: 326 (Claim 119),
referred to in the present application as "PRO1281." The invention is further directed to
monoclonal antibodies (Claim 120), humanized antibodies (Claim 121), and labeled antibodies
(Claim 123) that specifically bind to the polypeptide of SEQ ID NO: 326. The PRO1281 gene
was shown for the first time in the present application to be significantly amplified in human
colon cancers as compared to normal, non-cancerous human tissue controls (Example 170).

Support for the preparation and uses of antibodies is found throughout the specification, 
including, for example, pages 390-395. The preparation of antibodies is described in 
Example 144, while Example 145 describes the use of the antibodies for purifying the 
polypeptides to which they bind. Isolated antibodies are defined in the specification at page 315,
line 31. Support for monoclonal antibodies is found in the specification at, for example,
page 390, line 17, to page 392, line 3. Support for humanized antibodies is found in the 
specification at, for example, page 392, line 4, to page 393, line 6. Support for antibody 
fragments is found in the specification at, for example, page 314, line 30 onwards. Support for 
labeled antibodies is found in the specification at, for example, page 316, lines 3. 

The amino acid sequence of the native "PRO1281" polypeptide and the nucleic acid
sequence encoding this polypeptide (referred to in the present application as "DNA59820-1549")
are shown in the present specification as SEQ ID NOs: 326 and 325, respectively, and in
Figures 233 and 232, described on page 299, lines 9-12. The full-length PRO1281 polypeptide
having the amino acid sequence of SEQ ID NO:326 is described in the specification at, for
example, page 31 and pages 209-211, and the isolation of cDNA clones encoding PRO1281 of
SEQ ID NO:326 is described in Example 102, pages 485-486 of the specification.

Finally, Example 170, in the specification at page 539, line 19, to page 555, line 5, sets
forth a 'Gene Amplification assay' which shows that the PRO1281 gene is amplified in the
genome of certain human colon cancers (see Table 9, page 554). The profiles of various primary
colon tumors used for screening the PRO polypeptide compounds of the invention in the gene
amplification assay are summarized in Table 8, page 546 of the specification.

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

1. Whether Claims 119-121 and 123 are entitled to an earlier priority date based on a
proper priority claim to U.S. Provisional Patent Application Serial No. 60/141,037, filed
June 23, 1999.

2. Whether Claims 119-121 and 123 satisfy the utility/enablement requirement
under 35 U.S.C. §§101/112, first paragraph.

3. (a) Whether Claims 119-121 and 123 are anticipated under 35 U.S.C. § 102(b)
by Baker (WO99/63088 - dated December 1999).

(b) Whether Claims 119-121 and 123 are anticipated under 35 U.S.C. § 102(a) 
by Tang (WO 01/53312 - dated July 2001). 

4. Whether Claims 119-121 and 123 are patentable under 35 U.S.C. §103(a) over 
Weimann (2001) in view of Tang et al. (WO 01/53312 - dated July 2001). 

VII. ARGUMENTS 
Summary of the Arguments: 

Issue 1: U.S. Provisional Patent Application Serial No. 60/141,037 Provides Proper Priority
Claim for Instant Application

The instant application has not been granted the earlier priority date on the grounds that
the 60/141,037 application fails to provide utility under 35 U.S.C. §101. For the same detailed
reasons discussed below under Issue 2 (utility) for the instant application, Appellants maintain
that the results of the gene amplification assay for PRO1281 were sufficiently disclosed in U.S.
Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999, to which proper
priority has been claimed in this application. Hence, the present application should be entitled to
at least the priority date of June 23, 1999.

Issue 2: Utility/Enablement

Appellants rely upon the gene amplification data of the PRO1281 gene for patentable
utility of the PRO1281 polypeptides. These data are clearly disclosed in the instant specification in
Example 170, which discloses that the gene encoding PRO1281 showed significant amplification,
ranging from 2.099-fold to 2.219-fold, in different primary colon tumors. Therefore, such a gene
is useful as a marker for the diagnosis of colon cancer, for monitoring cancer development,
and/or for measuring the efficacy of cancer therapy.

The Examiner asserted in the Final Office Action mailed August 11, 2005 that
amplification of the PRO1281 polynucleotide does not impart a specific, substantial, and credible
utility for the PRO1281 polypeptide and its antibodies. In support of this assertion, the Examiner
cited references by Pennica et al., Konopka et al., Haynes et al. and Hu et al. and also
maintained previous rejections based on Sen et al.

Appellants submit that the teachings of Pennica et al. and Konopka et al. are not
directed towards genes in general but to a single gene or genes within a single family and thus,
their teachings cannot support a general conclusion regarding correlation between gene
amplification and mRNA or protein levels. Further, Appellants submit that the teachings of
Haynes et al. in fact meet the "more likely than not" standard and show that a positive
correlation exists between mRNA and protein. And based on the nature of the statistical analysis
performed in one class of genes in Hu et al., the Examiner's conclusions are not reliably
supported. Thus, Appellants submit that these references do not conclusively establish a prima
facie case for lack of utility.

Appellants further submit that Sen et al. in fact supports the Appellants' position that even
aneuploidy, which may be a feature of either cancerous or pre-cancerous tissue or damaged
tissue, is still useful to diagnose the propensity towards cancer or to diagnose cancer itself.

Appellants have also submitted ample evidence to show that, in general, if a gene is
amplified in cancer, it is more likely than not that the encoded protein will also be expressed at
an elevated level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al. (made of
record in Appellants' Response filed June 28, 2004) collectively teach that, in general, gene
amplification increases mRNA expression. Second, the Declaration of Dr. Paul Polakis (made of
record in Appellants' Response filed June 28, 2004), principal investigator of the Tumor Antigen
Project of Genentech, Inc., the assignee of the present application, shows that, in general, there is
a correlation between mRNA levels and polypeptide levels.

Appellants further note that the sale of gene expression chips to measure mRNA levels is 
a highly successful business, with a company such as Affymetrix recording 168.3 million dollars 
in sales of their GeneChip arrays in 2004 alone. Clearly, the research community believes that 
the information obtained from these chips is useful (i.e., that it is more likely than not
informative of the protein level).

Taken together, although there are some examples in the scientific art that do not fit
within the central dogma of molecular biology that there is a correlation between DNA, mRNA,
and polypeptide levels, these instances are exceptions rather than the rule. In the majority of
amplified genes, as exemplified by Orntoft et al., Hyman et al., Pollack et al., the Polakis
Declaration and the widespread use of array chips, the teachings in the art overwhelmingly show
that gene amplification influences gene expression at the mRNA and protein levels. Therefore,
one of skill in the art would reasonably expect in this instance, based on the amplification data
for the PRO1281 gene, that the PRO1281 polypeptide is concomitantly overexpressed. Thus, the
claimed antibodies to PRO1281 polypeptides also have utility in the diagnosis of cancer.

Appellants further submit that even if there were no correlation between gene
amplification and increased mRNA/protein expression (which Appellants expressly do not
concede), a polypeptide encoded by a gene that is amplified in cancer would still have a specific,
substantial, and credible utility. Appellants submit that, as evidenced by the Ashkenazi
Declaration and the teachings of Hanna and Mornin (both made of record in Appellants'
Response filed June 28, 2004), simultaneous testing of gene amplification and gene product
over-expression enables more accurate tumor classification, even if the gene product, the protein,
is not over-expressed. This leads to better determination of a suitable therapy for the tumor, as
demonstrated by a real-world example of the breast cancer marker HER-2/neu. Accordingly,
Appellants submit that when the proper legal standard is applied, one should reach the
conclusion that the present application discloses at least one patentable utility for the claimed
PRO1281 polypeptides and antibodies thereto.

Accordingly, one of ordinary skill in the art would also understand how to make and use 
the recited antibodies for the diagnosis of colon cancer without any undue experimentation. 

Issue 3a: Anticipation Under 35 U.S.C. §102(b) by Baker (WO99/63088 - Dated
December 1999)

For the reasons discussed under Issue 2 on utility, Appellants believe that they have 
priority to U.S. Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999, to 
which a proper priority claim has been made and which discloses the gene amplification assay 
results for the PRO1281 gene. Therefore, Baker et al. is not prior art.

Issue 3b: Anticipation Under 35 U.S.C. §102(a) by Tang (WO 01/53312 - Dated July 2001)

For the reasons discussed under Issue 2 on utility, Appellants believe that they have 
priority to U.S. Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999, to 
which a proper priority claim has been made and which discloses the gene amplification assay 
results for the PRO1281 gene. Therefore, Tang et al. is not prior art.

Issue 4: Patentability Over Weimann (Dated 2001) in View of Tang et al. (WO 01/53312 -
Dated July 2001)

For the reasons discussed under Issue 2 on utility, Appellants believe that they have 
priority to U.S. Provisional Patent Application Serial No. 60/141,037, filed June 23, 1999, to 
which a proper priority claim has been made and which discloses the gene amplification assay 
results for the PRO1281 gene. Therefore, neither Weimann et al. nor Tang et al. is prior art.

These arguments are all discussed in further detail below under the appropriate headings. 

Response to Rejections 

ISSUE 1. Claims 119-121 and 123 are Supported by a Proper Priority Claim to U.S. 
Provisional Patent Application Serial No. 60/141,037 

The instant application has not been granted the earlier priority date on the grounds that
the 60/141,037 application fails to provide utility under 35 U.S.C. §101. For the detailed reasons
discussed below under Issue 2, Appellants maintain that they rely on the gene amplification
assay for patentable utility which was first disclosed in U.S. Provisional Patent Application
Serial No. 60/141,037, filed June 23, 1999, priority to which has been claimed in this
application. Hence, the present application is entitled to at least the priority date of
June 23, 1999.

ISSUE 2. Claims 119-121 and 123 are Supported by a Credible, Specific and Substantial
Asserted Utility, and Thus Meet the Utility Requirement of 35 U.S.C. §§101/112, First 
Paragraph 

The sole basis for the Examiner's rejection of Claims 119-121 and 123 under this section 
is that the data presented in Example 170 of the present specification is allegedly insufficient 
under the present legal standards to establish a patentable utility under 35 U.S.C. §101 for the 
presently claimed subject matter. 

Claims 119-121 and 123 stand further rejected under 35 U.S.C. §112, first paragraph,
allegedly "since the claimed invention is not supported by either a specific and substantial asserted 
utility or a well established utility for the reasons set forth above, one skilled in the art clearly would 
not know how to use the claimed invention." 

Appellants strongly disagree and respectfully traverse the rejection.

A. The Legal Standard For Utility Under 35 U.S.C. §101

According to 35 U.S.C. §101: 

Whoever invents or discovers any new and useful process, machine, manufacture, 
or composition of matter, or any new and useful improvement thereof, may obtain 
a patent therefor, subject to the conditions and requirements of this title. 
(Emphasis added). 

In interpreting the utility requirement, in Brenner v. Manson, 1 the Supreme Court held
that the quid pro quo contemplated by the U.S. Constitution between the public interest and the
interest of the inventors required that a patent Applicant disclose a "substantial utility" for his or
her invention, i.e., a utility "where specific benefit exists in currently available form." 2 The
Court concluded that "a patent is not a hunting license. It is not a reward for the search, but
compensation for its successful conclusion. A patent system must be related to the world of
commerce rather than the realm of philosophy." 3

Later, in Nelson v. Bowler, 4 the C.C.P.A. acknowledged that tests evidencing
pharmacological activity of a compound may establish practical utility, even though they may
not establish a specific therapeutic use. The Court held that "since it is crucial to provide
researchers with an incentive to disclose pharmaceutical activities in as many compounds as
possible, we conclude adequate proof of any such activity constitutes a showing of practical
utility." 5

In Cross v. Iizuka, 6 the C.A.F.C. reaffirmed Nelson, and added that in vitro results might
be sufficient to support practical utility, explaining that "in vitro testing, in general, is relatively
less complex, less time consuming, and less expensive than in vivo testing. Moreover, in vitro
results with the particular pharmacological activity are generally predictive of in vivo test results,
i.e., there is a reasonable correlation therebetween." 7 The Court perceived "No insurmountable
difficulty" in finding that, under appropriate circumstances, "in vitro testing, may establish a
practical utility." 8

The case law has also clearly established that Appellants' statements of utility are usually
sufficient, unless such statement of utility is unbelievable on its face. 9 The PTO has the initial
burden to prove that Appellants' claims of usefulness are not believable on their face. 10 In
general, an Applicant's assertion of utility creates a presumption of utility that will be sufficient
to satisfy the utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in
the art to question the objective truth of the statement of utility or its scope." 11, 12

Compliance with 35 U.S.C. §101 is a question of fact. 13 The evidentiary standard to be
used throughout ex parte examination in setting forth a rejection is a preponderance of the
totality of the evidence under consideration. 14 Thus, to overcome the presumption of truth that
an assertion of utility by the Applicant enjoys, the Examiner must establish that it is more likely
than not that one of ordinary skill in the art would doubt the truth of the statement of utility.
Only after the Examiner has made a proper prima facie showing of lack of utility does the burden of
rebuttal shift to the Applicant. The issue will then be decided on the totality of evidence.

1 Brenner v. Manson, 383 U.S. 519, 148 U.S.P.Q. (BNA) 689 (1966).

2 Id. at 534, 148 U.S.P.Q. (BNA) at 695.

3 Id. at 536, 148 U.S.P.Q. (BNA) at 696.

4 Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. (BNA) 881 (C.C.P.A. 1980).

5 Id. at 856, 206 U.S.P.Q. (BNA) at 883.

6 Cross v. Iizuka, 753 F.2d 1047, 224 U.S.P.Q. (BNA) 739 (Fed. Cir. 1985).

7 Id. at 1050, 224 U.S.P.Q. (BNA) at 747.

9 In re Gazave, 379 F.2d 973, 154 U.S.P.Q. (BNA) 92 (C.C.P.A. 1967).

11 In re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. (BNA) 288, 297 (C.C.P.A. 1974).

12 See also In re Jolles, 628 F.2d 1322, 206 USPQ 885 (C.C.P.A. 1980); In re Irons, 340
F.2d 974, 144 USPQ 351 (1965); In re Sichert, 566 F.2d 1154, 1159, 196 USPQ 209, 212-13
(C.C.P.A. 1977).

13 Raytheon v. Roper, 724 F.2d 951, 956, 220 U.S.P.Q. (BNA) 592, 596 (Fed. Cir. 1983),
cert. denied, 469 US 835 (1984).

14 In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d (BNA) 1443, 1444 (Fed. Cir.
1992).

The well established case law is clearly reflected in the Utility Examination Guidelines
("Utility Guidelines"), 15 which acknowledge that an invention complies with the utility
requirement of 35 U.S.C. §101, if it has at least one asserted "specific, substantial, and credible
utility" or a "well-established utility." Under the Utility Guidelines, a utility is "specific" when
it is particular to the subject matter claimed. For example, it is generally not enough to state that 
a nucleic acid is useful as a diagnostic without also identifying the conditions that are to be 
diagnosed. 

In explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however, 
that Office personnel must be careful not to interpret the phrase "immediate benefit to the 
public" or similar formulations used in certain court decisions to mean that products or services 
based on the claimed invention must be "currently available" to the public in order to satisfy the 
utility requirement. "Rather, any reasonable use that an applicant has identified for the invention 
that can be viewed as providing a public benefit should be accepted as sufficient, at least with 

regard to defining a 'substantial' utility." 16 Indeed, the Guidelines for Examination of
Applications for Compliance With the Utility Requirement, 17 give the following instruction to
patent examiners: "If the Applicant has asserted that the claimed invention is useful for any 
particular practical purpose . . . and the assertion would be considered credible by a person of 
ordinary skill in the art, do not impose a rejection based on lack of utility." 

B. Proper Application of the Legal Standard 

Appellants respectfully submit that the data presented in Example 170 starting on
page 539 of the specification and the cumulative evidence of record, which
underlies the current dispute, indeed support a "specific, substantial and credible" asserted utility
for the presently claimed invention.

15 66 Fed. Reg. 1092 (2001).

16 M.P.E.P. §2107.01.

17 M.P.E.P. §2107 II(B)(1).

Patentable utility for the PRO1281 polypeptides is based upon the gene amplification
data for the gene encoding the PRO1281 polypeptide. Example 170 describes the results
obtained using a very well-known and routinely employed polymerase chain reaction
(PCR)-based assay, the TaqMan™ PCR assay, also referred to herein as the gene amplification
assay. This assay allows one to quantitatively measure the level of gene amplification in a given
sample, say, a tumor extract, or a cell line. It was well known in the art at the time the invention
was made that gene amplification is an essential mechanism for oncogene activation. Appellants
isolated genomic DNA from a variety of primary cancers and cancer cell lines that are listed in
Table 9 (pages 539 onwards of the specification), including primary colon cancers of the type
and stage indicated in Table 8 (page 546). The tumor samples were tested in triplicate with
TaqMan™ primers and with internal controls, beta-actin and GAPDH, in order to quantitatively
compare DNA levels between samples (page 548, lines 33-34). As a negative control, DNA
isolated from the cells of ten normal healthy individuals was pooled
(page 539, lines 27-29); no-template controls were also used (page 548, lines 33-34). The results of
TaqMan™ PCR are reported in ΔCt units, as explained in the passage on page 539, lines 37-39.
One unit corresponds to one PCR cycle, or approximately a 2-fold amplification relative to
control; two units correspond to 4-fold, three units to 8-fold amplification, and so on. Using this
PCR-based assay, Appellants showed that the gene encoding PRO1281 was amplified, that is,
it showed approximately 1.07-1.15 ΔCt units, which corresponds to 2^1.07- to 2^1.15-fold
amplification, or 2.099-fold to 2.219-fold, in different primary colon tumors.
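
By way of illustration only, the conversion from PCR cycle units to fold amplification described above can be sketched as follows. The sketch assumes ideal doubling of product in each PCR cycle, so that fold amplification equals 2 raised to the number of cycle units; the function name fold_amplification is hypothetical and does not appear in the specification.

```python
def fold_amplification(delta_ct: float) -> float:
    """Convert a delta-Ct value (in PCR cycles) to fold amplification
    relative to control.

    Assumption: ideal PCR efficiency, i.e. product doubles every cycle,
    so one unit is 2-fold, two units 4-fold, three units 8-fold.
    """
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Whole-unit examples, followed by the range reported for PRO1281.
    for delta_ct in (1.0, 2.0, 3.0, 1.07, 1.15):
        fold = fold_amplification(delta_ct)
        print(f"delta-Ct = {delta_ct:.2f} -> {fold:.3f}-fold")
```

Under this assumption, 1.07 and 1.15 units give approximately 2.099-fold and 2.219-fold, respectively, which agrees with the values stated above.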

However, the Examiner states regarding the teachings of the Goddard Declaration that
"the argument has been fully considered but is not deemed persuasive. Even though in some
circumstances and as discussed in the Goddard Declaration, TaqMan™ real-time PCR can
accurately and reproducibly assess gene amplification, in cancerous tissues it is necessary to
account for the possibility of aneuploidy." (Page 3, third paragraph of the Final Office Action).
The Examiner further refers to Sen et al. (Page 3 of the Final Office Action) to show that
"numeric aberrations in chromosomes, referred to as aneuploidy, is commonly observed in
human cancers. Therefore, because the gene amplification observed for PRO1281 is small and
could reasonably be expected to be due to aneuploidy, the implicit utility of a colon tumor
diagnostic is not specific and substantial."

Appellants respectfully traverse and point out that the Declaration by Dr. Audrey
Goddard, presented in their response mailed July 22, 2005, provides a statement by an expert in
the relevant art that "fold amplification" values of at least 2-fold are considered significant in the
TaqMan™ PCR gene amplification assay. Appellants particularly draw the Board's attention to
page 3 of the Goddard Declaration which clearly states that:

It is further my considered scientific opinion that an at least 2-fold increase in 
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor) 
sample is significant and useful in that the detected increase in gene copy number 
in the tumor sample relative to the normal sample serves as a basis for using 
relative gene copy number as quantitated by the TaqMan PCR technique as a 
diagnostic marker for the presence or absence of tumor in a tissue sample of 
unknown pathology. Accordingly, a gene identified as being amplified at least 
2-fold by the quantitative TaqMan PCR assay in a tumor sample relative to a 
normal sample is useful as a marker for the diagnosis of cancer, for monitoring 
cancer development and/or for measuring the efficacy of cancer therapy. 
(Emphasis added). 

Accordingly, the 2.099-fold to 2.219-fold amplification observed in different primary colon
tumors would be considered significant and credible by one skilled in the art, based upon the
facts disclosed in the Goddard Declaration.

Regarding aneuploidy and the Sen reference, Appellants agree with the teachings of Sen.
In fact, Sen et al. supports the Appellants' position, since the Examiner himself indicates that
"aneuploidy is commonly observed in human cancers." That is, even if the observed increase in
gene amplification were due to aneuploidy (which Appellants do not concede), PRO1281 is
still useful to diagnose the propensity for cancer or colon cancer itself. For instance, many
articles on colon cancer published around June 23, 1999 (the effective filing date of this
application) studied damaged or premalignant lesions and suggested that epithelial
tumors develop through a multistep process driven by genetic instability and that a subset of the
same molecular changes found in associated tumors were also found in premalignant lesions,
suggesting that these premalignant lesions might represent precursor lesions for associated
tumors, i.e., a manifestation of a multistep tumorigenesis process. Based on the well-known art,
Appellants submit that there is utility in identifying genetic biomarkers in epithelial tissues at
cancer risk. For instance, any skilled artisan in the field of colon oncology would easily
appreciate that early detection of preinvasive colon cancer and a greater understanding of
premalignant colon conditions provide information in advance about risk assessment, prognosis
and therapy for colon cancer.

Further, any skilled artisan in the field of oncology would also appreciate that not all
tumor markers are generally associated with every tumor, or even with most tumors. In fact,
some tumor markers are useful for identifying rare malignancies. That is, the association of the
tumor marker with a particular type of tumor lesion may be rare, or the occurrence of that
particular kind of tumor lesion itself may be rare. In either event, even these rare tumor markers,
which may not give a positive hit with most common tumors, have great value in tumor
diagnosis, and consequently, in tumor prognosis. The skilled artisan would know that such
tumor markers are very useful for better classification of tumors. Therefore, whether the
PRO1281 gene is amplified in most colon tumors is not relevant to its identification as a tumor
marker, or its patentable utility. Rather, whether the amplification data for PRO1281 is
significant is what lends support to its usefulness as a tumor marker. Therefore, the rejection
indicating that the asserted utility is "not specific or substantial because the gene amplification
observed is small" is not legally correct. It was well known in the art at the time of filing of the
application that gene amplification, which occurs in most solid tumors like colon cancers, is
generally associated with poor prognosis. Therefore, the PRO1281 gene becomes an important
diagnostic marker for identifying malignant colon cancers, even if the malignancy associated
with the PRO1281 molecule is a rare occurrence. Accordingly, the present specification clearly
discloses sufficient evidence that the gene encoding the PRO1281 polypeptide is significantly
amplified in certain types of colon tumors and, therefore, antibodies to PRO1281 are valuable
diagnostic markers for identifying certain types of colon cancers.

Taken together, even if the observed PRO1281 gene amplification were due to
chromosomal aneuploidy (which Appellants do not concede), such an observation would still
support at least one utility for the PRO1281 gene and therefore antibodies to PRO1281 because it
helps in identifying individuals at significantly increased cancer risk. Accordingly, the instant
polypeptides and their antibodies find utility as a diagnostic for colon cancer or for individuals
at risk of developing colon cancer.

C. A prima facie case of lack of utility has not been established

As discussed above, the increase in DNA copy number for the PRO1281 gene is
significant. Further, the evidentiary standard to be used throughout ex parte examination of a 
patent application is a preponderance of the totality of the evidence under consideration. Thus, 
to overcome the presumption of truth that an assertion of utility by the applicant enjoys, the 
Examiner must establish that it is more likely than not that one of ordinary skill in the art would 
doubt the truth of the statement of utility. Only after the Examiner has made a proper prima 
facie showing of lack of utility, does the burden of rebuttal shift to the Applicant. 

Accordingly, it is not a legal requirement to establish a necessary correlation between an
increase in the copy number of the DNA and protein expression levels that would correlate to the
disease state, nor is it imperative to find evidence that DNA amplification is "necessarily" or
"always" associated with overexpression of the gene product. Appellants respectfully submit
that when the proper evidentiary standard is applied, a correlation must be acknowledged.
Appellants submit that the teachings of Pennica et al. and Konopka et al. are not directed
towards genes in general but to a single gene or genes within a single family and thus, their
teachings cannot support a general conclusion regarding correlation between gene amplification
and mRNA or protein levels. For instance, the teachings of Pennica et al. are specific to WISP
genes, a specific class of closely related molecules. Pennica et al. showed that there was good
correlation between DNA and mRNA expression levels for the WISP-1 gene but not for the WISP-2
and WISP-3 genes. But the fact that, in the case of closely related molecules, there seemed to be
no correlation between gene amplification and the level of mRNA/protein expression does not
establish that it is more likely than not, in general, that such correlation does not exist. As
discussed above, the standard is not absolute certainty. Pennica et al. has no teaching
whatsoever about the correlation of gene amplification and protein expression for genes in
general. Similarly, in Konopka et al., Appellants submit that the Examiner has generalized a
very specific result disclosed by Konopka et al. to cover all genes. Konopka et al. actually state
that "[p]rotein expression is not related to amplification of the abl gene but to variation in the
level of bcr-abl mRNA produced from a single Ph1 template." (See Konopka et al., Abstract,
emphasis added). The paper does not teach anything whatsoever about the correlation of protein
expression and gene amplification in general, and provides no basis for the generalization that
apparently underlies the present rejection. The statement of Konopka et al. that "[p]rotein
expression is not related to amplification of the abl gene ..." is not sufficient to establish a
prima facie case of lack of utility. Therefore, the combined teachings of Pennica et al. and
Konopka et al. are not directed towards genes in general but to a single gene or genes within a
single family and thus, their teachings cannot support a general conclusion regarding correlation
between gene amplification and mRNA or protein levels.

Actually, the cited reference Haynes et al. showed that "there was a general trend, 
although no strong correlation between protein [expression] and transcript levels." (see Figure 1 
and page 1863, paragraph 2.1, last line). Therefore, when the proper legal standard is used, 
Haynes clearly supports the Appellants' position that in general, a positive correlation exists 
between mRNA and protein. This is all that is needed to meet the "more likely than not" 
evidentiary standard. Again, accurate prediction is not the standard. Therefore, a prima facie 
case of lack of utility has not been met based on the cited references Pennica et al., Konopka et 
al. and Haynes et al. 

The Examiner further cited Hu et al. to show that "the literature cautions researchers 
against drawing conclusions based on small changes in transcript expression levels between 
normal and cancerous tissues" (Page 6 of the Final Office Action mailed August 11, 2005). 

First of all, as discussed above, the increase in DNA copy number for the PRO1281 gene 
is significant. Further, Appellants respectfully submit that, contrary to the Examiner's assertion, 
the cited Hu et al. reference does not conclusively establish a prima facie case for lack of utility 
for the PRO1281 molecule. The Hu et al. reference is entitled "Analysis of Genomic and 
Proteomic Data using Advanced Literature Mining" (emphasis added). Therefore, as the title 
itself suggests, the conclusions in this reference are based upon statistical analysis of information 
obtained from published literature, and not from experimental data. Hu et al. performed 
statistical analysis to provide evidence for a relationship between mRNA expression and 
biological function of a given molecule (as in disease). The conclusions of Hu et al. however, 
only apply to a specific type of breast tumor (estrogen receptor (ER)-positive breast tumor) and 
cannot be generalized to breast cancer genes in general, let alone to cancer genes in general. 
"Interestingly, the observed correlation was only found among ER-positive (breast) tumors not 
ER-negative tumors." (See page 412, left column). 
Moreover, the analytical methods utilized by Hu et al. have certain statistical drawbacks, 
as the authors themselves admit. For instance, according to Hu et al., "different statistical 
methods" were applied to "estimate the strength of gene-disease relationships and evaluated the 
results." (See page 406, left column, emphasis added). Using these different statistical methods, 
Hu et al. "[a]ssessed the relative strengths of gene-disease relationships based on the frequency 
of both co-citation and single citation." (See page 411, left column). As is well known in the 
art, different statistical methods allow different variables to be manipulated to affect the resulting 
outcome. In this regard, the authors disclose that "[i]nitial attempts to search the literature" using 
the list of genes, gene names, gene symbols, and frequently used synonyms generated by the 
authors "revealed several sources of false positives and false negatives." (See page 406, right 
column). The authors add that the false positives caused by "duplicative and unrelated meanings 
for the term" were "difficult to manage." Therefore, in order to minimize such false positives, 
Hu et al. disclose that these terms "had to be eliminated entirely, thereby reducing the false 
positive rate but unavoidably under-representing some genes." Id. (emphasis added). Hence, Hu 
et al. had to manipulate certain aspects of the input data, in order to generate, in their opinion, 
meaningful results. Further, because the frequency of citation for a given molecule and its 
relationship to disease only reflects the current research interest of a molecule, and not the true 
biological function of the molecule, as the authors themselves acknowledge, the "[r]elationship 
established by frequency of co-citation do not necessarily represent a true biological link." (See 
page 411, right column). Therefore, based on these findings, the authors add, "[t]his may reflect 
a bias in the literature to study the more prevalent type of tumor in the population. Furthermore, 
this emphasizes that caution must be taken when interpreting experiments that may contain 
subpopulations that behave very differently." Id. (Emphasis added). In other words, some 
molecules may have been underrepresented merely because they were less frequently cited or 
studied in literature compared to other more well-cited or studied genes. Therefore, Hu et al.'s 
conclusions are not based on genes/mRNA in general. 

Therefore, Appellants submit that, based on the nature of the statistical analysis 
performed therein, and in particular, based on Hu's analysis of one class of genes, namely, the 
estrogen receptor (ER)-positive breast tumor genes, the conclusion drawn by the Examiner, 
namely that "genes displaying a 5-fold change or less (mRNA expression) in tumors compared 
to normal showed no evidence of a correlation between altered gene expression and a known role 
in the disease (in general)," is not reliably supported. 

Therefore, when the proper legal standard is used, a prima facie case of lack of utility has 
not been met based on the cited reference Hu et al. either. 

On the contrary, Appellants submit that Example 170 in the specification further 
discloses that, "[a]mplification is associated with overexpression of the gene product, indicating 
that the polypeptides are useful targets for therapeutic intervention in certain cancers such as 
colon, lung, breast and other cancers and diagnostic determination of the presence of those 
cancers" (emphasis added). Besides, Appellants have submitted ample evidence (discussed 
below) to show that, in general, if a gene is amplified in cancer, it is "more likely than not" 
that the encoded protein will also be expressed at an elevated level. 

For support, Appellants presented the articles by Orntoft et al., Hyman et al., and Pollack 
et al. (made of record in Appellants' Response filed June 28, 2004), who collectively teach that 
in general, for most genes, DNA amplification increases mRNA expression. The results 
presented by Orntoft et al., Hyman et al., and Pollack et al. are based upon wide-ranging 
analyses of a large number of tumor associated genes. Orntoft et al. studied transcript levels of 
5600 genes in malignant bladder cancers, many of which were linked to the gain or loss of 
chromosomal material, and found that in general (18 of 23 cases) chromosomal areas with more 
than 2-fold gain of DNA showed a corresponding increase in mRNA transcripts. Hyman et al. 
compared DNA copy numbers and mRNA expression of over 12,000 genes in breast cancer 
tumors and cell lines, and found that there was evidence of a prominent global influence of copy 
number changes on gene expression levels. In Pollack et al., the authors profiled DNA copy 
number alteration across 6,691 mapped human genes in 44 predominantly advanced primary 
breast tumors and 10 breast cancer cell lines, and found that on average, a 2-fold change in DNA 
copy number was associated with a corresponding 1.5-fold change in mRNA levels. In 
summary, the evidence supports the Appellants' position that gene amplification is more likely 
than not predictive of increased mRNA and polypeptide levels. 

Second, the Declaration of Dr. Paul Polakis (made of record in Appellants' Response 
filed June 4, 2004), principal investigator of the Tumor Antigen Project of Genentech, Inc., the 
assignee of the present application, explains that in the course of Dr. Polakis' research using 
microarray analysis, he and his co-workers identified approximately 200 gene transcripts that are 
present in human tumor cells at significantly higher levels than in corresponding normal human 
cells. Appellants submit that Dr. Polakis' Declaration was presented to support the position that 
there is a correlation between mRNA levels and polypeptide levels, the correlation between gene 
amplification and mRNA levels having already been established by the data shown in the Orntoft 
et al., Hyman et al., and Pollack et al. articles. Appellants further emphasize that the opinions 
expressed in the Polakis Declaration, including in the above quoted statement, are all based on 
factual findings. For instance, antibodies binding to about 30 of these tumor antigens were 
prepared, and mRNA and protein levels were compared. In approximately 80% of the cases, the 
researchers found that increases in the level of a particular mRNA correlated with changes in the 
level of protein expressed from that mRNA when human tumor cells are compared with their 
corresponding normal cells. Therefore, Dr. Polakis' research, which is referenced in his 
Declaration, shows that, in general, there is a correlation between increased mRNA and 
polypeptide levels. 

Appellants further note that the sale of gene expression chips to measure mRNA levels is 
a highly successful business, with a company such as Affymetrix recording 168.3 million dollars 
in sales of their GeneChip® arrays in 2004. Clearly, the research community believes that the 
information obtained from these chips is useful (i.e., that it is more likely than not that the results 
are informative of protein levels). 

Taken together, all of the submitted evidence supports the Appellants' position that, in the 
majority of amplified genes, increased gene amplification levels, more likely than not, predict 
increased mRNA and polypeptide levels, which clearly meets the utility standards described 
above. Hence, one of skill in the art would reasonably expect that, based on the gene 
amplification data of the PRO1281 gene, the PRO1281 polypeptide is concomitantly 
overexpressed in the colon tumors studied as well and hence PRO1281 antibodies would be 
useful in the diagnosis of cancer. 

Appellants further submit that even if there were no correlation between gene 
amplification and increased mRNA/protein expression (which Appellants expressly do not 
concede), a polypeptide encoded by an amplified gene in cancer would still have a specific, 
substantial, and credible utility, as was discussed in the Declaration of Dr. Avi Ashkenazi 
(submitted with Appellants' Response filed June 4, 2004). 

According to the Declaration, even if over-expression of the gene product does not 
parallel gene amplification in certain tumor types, parallel monitoring of gene amplification and 
gene product over-expression enables more accurate tumor classification and hence better 
determination of suitable therapy. In addition, absence of over-expression is crucial information 
for the practicing clinician. If a gene is amplified in a tumor, but the corresponding gene product 
is not over-expressed, the clinician will decide not to treat a patient with agents that target that 
gene product. This not only saves money, but also has the benefit that the patient can avoid 
exposure to the side effects associated with such agents. 

This utility is further supported by the teachings of the article by Hanna and Mornin 
(Pathology Associates Medical Laboratories, August (1999), submitted with the Response filed 
June 4, 2004). The article teaches that the HER-2/neu gene has been shown to be amplified 
and/or over-expressed in 10%-30% of invasive breast cancers and in 40%-60% of intraductal 
breast carcinomas. Further, the article teaches that diagnosis of breast cancer includes testing 
both the amplification of the HER-2/neu gene (by FISH) as well as the over-expression of the 
HER-2/neu gene product (by IHC). Even when the protein is not over-expressed, the assay 
relying on both tests leads to a more accurate classification of the cancer and a more effective 
treatment of it. 

However, the Examiner asserts regarding Hanna and Mornin that, 

"Hanna et al. go on to state that FISH (gene) and IHC (protein) results correlate 
well. However, subsets of tumors are found which show discordant results; i.e., 
protein overexpression without gene amplification or lack of protein 
overexpression with gene amplification. The clinical significance of such results 
is unclear. Therefore, the issues of HER-2 cannot be generalized to any gene 
expressed in a tumor," (last four lines of Page 4 of the Final Office Action mailed 
August 11, 2005). 

Again, Appellants respectfully submit that, as the Examiner himself acknowledges, gene 
amplification (as measured by FISH) and polypeptide expression (as measured by 
immunohistochemistry, IHC) are well correlated ("in general, FISH and IHC results correlate 
well" (Hanna et al. p. 1, col. 2)). It is only a subset of tumors which show discordant results. 
The Examiner appears to view such results as "unclear." On the other hand, Appellants submit 
that Hanna et al. in fact support the Appellants' position rather well in that it is more likely than 
not that gene amplification correlates with increased polypeptide expression, and this conclusion 
is not drawn based on the HER-2 results of Hanna and Mornin alone, but on the overwhelming 
evidence presented in the Orntoft et al., Hyman et al., and Pollack et al. references and the 
Polakis Declaration (made of record in Appellants' Response filed June 4, 2004). 

Thus, based on the asserted utility for PRO1281 in the diagnosis of selected colon 
tumors, the reduction to practice of the instantly claimed protein sequence of SEQ ID NO: 326 in 
the present application, the disclosure of the step-by-step protocols for making PRO 
polypeptides, the disclosure of a step-by-step protocol for making and expressing PRO1281 in 
appropriate host cells (in Examples 140-143 and page 376, line 12), the step-by-step protocol for 
the preparation, isolation and detection of monoclonal, polyclonal and other types of antibodies 
against the PRO1281 protein in the specification (at pages 390-395) and the disclosure of the 
gene amplification assay in Example 170, the skilled artisan would know exactly how to make 
and use the claimed polypeptide for the diagnosis of colon cancers and use antibodies to the 
polypeptides for the diagnosis of colon cancer. Appellants submit that based on the detailed 
information presented in the specification and the advanced state of the art in oncology, the 
skilled artisan would have found such testing routine and not 'undue.' Thus, barring evidence to 
the contrary, Appellants maintain that the fold amplification disclosed for the PRO1281 gene is 
significant and forms the basis for the utility for the claimed antibodies to the PRO1281 
polypeptide. 

Therefore, since the instantly claimed invention is supported by either a credible, specific 
and substantial asserted utility or a well-established utility, and since the present specification 
clearly teaches one skilled in the art "how to make and use" the claimed invention without undue 
experimentation, Appellants respectfully request reconsideration and reversal of the outstanding 
rejections under 35 U.S.C. §101 and §112, First Paragraph to Claims 119-121 and 123. 

Issue 3a: Claims 119-121 and 123 are not Anticipated Under 35 U.S.C. §102(a) Over Baker 
(WO99/63088 - dated December 1999) 

Claims 119-121 and 123 remain rejected under 35 U.S.C. §102 (a) as being anticipated 
by the claims of Baker (WO99/63088 - dated December 1999). 

For the reasons discussed above, Appellants maintain that they are entitled to a priority 
date of June 23, 1999 for the instant application. The Examiner says that the priority 
application lacks utility and hence this priority date cannot be given. For the very same reasons 
discussed above under Issue 1: Utility/Enablement, Appellants submit that the disclosure of the 
priority application, U.S. Provisional Patent Application Serial No. 60/141,037 filed 
June 23, 1999 has utility and is enabled. Therefore, the priority date should be accorded and 
hence, application WO99/63088 is not prior art. Hence, Appellants request withdrawal of this 
rejection. 

Issue 3b: Claims 119-121 and 123 are not Anticipated Under 35 U.S.C. §102(a) Over 
Tang (WO 01/53312 - Dated July 2001) 

Claims 119-121 and 123 remain rejected under 35 U.S.C. §102(a) as being anticipated 
by the claims of Tang (WO 01/53312 - dated July 2001). 

For the reasons discussed above, Appellants maintain and believe that they are entitled to 
the effective priority date of June 23, 1999 of U.S. Provisional Patent Application Serial 
No. 60/141,037. Thus, the reference Tang et al. dated 2001 is not prior art under 35 U.S.C. 
§ 102(a). Therefore, the present claims are not anticipated by Tang et al., and hence, this 
rejection under 35 U.S.C. §102(a) should be withdrawn. 

Issue 4: Claims 119-121 and 123 are Patentable Under 35 U.S.C. §103(a) Over Weimann 
(2001) in View of Tang et al. (WO 01/53312 - Dated July 2001) 

Claims 119-121 and 123 remain rejected under 35 U.S.C. §103(a) over Weimann (2001) 
in view of Tang et al. (WO 01/53312 - dated July 2001). 

For the reasons discussed above, Appellants maintain and believe that they are entitled to 
the effective priority date of June 23, 1999 of U.S. Provisional Patent Application Serial 
No. 60/141,037. Thus, the primary reference Weimann dated 2001 is not prior art and neither is 
the Tang reference. Therefore, the present claims are not obvious over Weimann in view of Tang 
and hence, this rejection under 35 U.S.C. § 103(a) should be withdrawn. 

CONCLUSION 

For the reasons given above, Appellants submit that the present specification clearly 
describes, details and provides a patentable utility for the claimed invention. Moreover, it is 
respectfully submitted that based upon this disclosed patentable utility, the present specification 
clearly teaches "how to use" the presently claimed polypeptide. As such, Appellants respectfully 
request reconsideration and reversal of the outstanding rejection of Claims 119-121 and 123. 

CLAIMS APPENDIX 



Claims on Appeal 



119. An antibody, or fragment thereof, that specifically binds to the polypeptide of 
SEQ ID NO: 326. 

120. The antibody of Claim 119 which is a monoclonal antibody. 

121. The antibody of Claim 119 which is a humanized antibody. 

123. The antibody of Claim 119 which is labeled. 

X. EVIDENCE APPENDIX 

1. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. § 1.132. 

2. Declaration of Avi Ashkenazi, Ph.D. under 37 C.F.R. § 1.132, with attached 
Exhibit A (Curriculum Vitae). 

3. Declaration of Audrey Goddard, Ph.D. under 37 C.F.R. § 1.132, with attached 
Exhibits A-G: 

A. Curriculum Vitae of Audrey D. Goddard, Ph.D. 

B. Higuchi, R. et al., "Simultaneous amplification and detection of specific DNA 
sequences," Biotechnology 10:413-417 (1992). 

C. Livak, K.J., et al., "Oligonucleotides with fluorescent dyes at opposite ends 
provide a quenched probe system useful for detecting PCR product and nucleic 
acid hybridization," PCR Methods Appl. 4:357-362 (1995). 

D. Heid, C.A. et al., "Real time quantitative PCR," Genome Res. 6:986-994 (1996). 

E. Pennica, D. et al., "WISP genes are members of the connective tissue growth 
factor family that are up-regulated in Wnt-1-transformed cells and aberrantly 
expressed in human colon tumors," Proc. Natl. Acad. Sci. USA 95:14717-14722 
(1998). 

F. Pitti, R.M. et al., "Genomic amplification of a decoy receptor for Fas ligand in 
lung and colon cancer," Nature 396:699-703 (1998). 

G. Bieche, I. et al., "Novel approach to quantitative polymerase chain reaction using 
real-time detection: Application to the detection of gene amplification in breast 
cancer," Int. J. Cancer 78:661-666 (1998). 

4. Orntoft, T.F., et al., "Genome-wide Study of Gene Copy Numbers, Transcripts, 
and Protein Levels in Pairs of Non-Invasive and Invasive Human Transitional Cell Carcinomas," 
Molecular & Cellular Proteomics 1:37-45 (2002). 

5. Hyman, E., et al., "Impact of DNA Amplification on Gene Expression Patterns in 
Breast Cancer," Cancer Research 62:6240-6245 (2002). 

6. Pollack, J.R., et al., "Microarray Analysis Reveals a Major Direct Role of DNA 
Copy Number Alteration in the Transcriptional Program of Human Breast Tumors," Proc. Natl. 
Acad. Sci. USA 99:12963-12968 (2002). 

7. Hanna et al., "HER-2/neu Breast Cancer Predictive Testing," Pathology 
Associates Medical Laboratories (1999). 

8. Pennica et al., "WISP genes are members of the connective tissue growth factor 
family that are up-regulated in Wnt-1-transformed cells and aberrantly expressed in human colon 
tumors," Proc. Natl. Acad. Sci. USA 95:14717-14722 (1998). 

9. Konopka et al., "Variable expression of the translocated c-abl oncogene in 
Philadelphia-chromosome-positive B-lymphoid cell lines from chronic myelogenous leukemia 
patients," Proc Natl Acad Sci USA. 83:4049-52 (1986). 

10. Haynes et al., "Proteome analysis: Biological assay or data archive?" 
Electrophoresis 19:1862-1871 (1998). 

11. Baker et al., (WO99/63088 - dated December 1999). 

12. Tang et al., (WO01/53312 - dated July 2001). 

13. Weimann et al., (2001). 

14. Sen S., "Aneuploidy and Cancer", Current Opinion in Oncology, 12: 82-88 
(2000). 

15. Pages 303-306 of specification which were missing in specification. Requested 
by Examiner on Page 6 of the Final Office Action mailed August 11, 2005. 

Items 1, 2 and 4-7 were submitted with Appellants' Response filed June 4, 2004, and were 
considered by the Examiner as indicated in the Final Office action mailed June 28, 2004. 

Item 3 was submitted with Appellants' Response filed July 22, 2005, and was considered by the 
Examiner as indicated in the Final Office action mailed August 11, 2005. 

Items 8-10 were made of record by the Examiner in the Final Office Action mailed 
June 28, 2004. 

Items 11-13 were made of record by the Examiner in the Office Action mailed March 9, 2004. 
Item 14 was made of record by the Examiner in the Final Office Action mailed August 11, 2005. 
Item 15 was requested by the Examiner in the Final Office Action mailed August 11, 2005. 

None. 

DECLARATION OF PAUL POLAKIS, Ph.D. 

I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan 
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff 
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has 
been leading Genentech's Tumor Antigen Project, which is a large research project 
with a primary focus on identifying tumor cell markers that find use as targets for 
both the diagnosis and treatment of cancer in humans. 

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells. 
The purpose of this research is to identify proteins that are abundantly expressed 
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at 
lower levels, on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins." When such a tumor antigen protein is 
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we 
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4 
above, we have observed that there is a strong correlation between changes in the 
level of mRNA present in any particular cell type and the level of protein 
expressed from that mRNA in that cell type. In approximately 80% of our 
observations we have found that increases in the level of a particular mRNA 
correlates with changes in the level of protein expressed from that mRNA when 
human tumor cells are compared with their corresponding normal cells. 

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a 
central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded protein. While there have been 
published reports of genes for which such a correlation does not exist, it is my 
opinion that such reports are exceptions to the commonly understood general rule 
that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein. 

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued 

Induction of apoptosis by Apo-2 Ligand, a new member of the tumor necrosis 
factor cytokine family. J. Biol. Chem. 271, 12687-12690 (1996). 

40. Marsters, S., Pitti, R., Donahue, C., Rupert, S., Bauer, K., and Ashkenazi, A. 
Activation of apoptosis by Apo-2 ligand is independent of FADD but blocked by 
CrmA. Curr. Biol. 6, 1669-1676 (1996). 

41. Marsters, S., Skubatch, M., Gray, C., and Ashkenazi, A. Herpesvirus entry 
mediator, a novel member of the tumor necrosis factor receptor family, activates 
the NF-kB and AP-1 transcription factors. J. Biol. Chem. 272, 14029-14032 
(1997). 

42. Sheridan, J., Marsters, S., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., 
Ramakrishnan, L., Gray, C., Baker, K., Wood, W.I., Goddard, A., Godowski, P., and 
Ashkenazi, A. Control of TRAIL-induced apoptosis by a family of signaling and 
decoy receptors. Science 277, 818-821 (1997). 

43. Marsters, S., Sheridan, J., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., Huang, A., 
Yuan, J., Goddard, A., Godowski, P., and Ashkenazi, A. A novel receptor for 
Apo2L/TRAIL contains a truncated death domain. Curr. Biol. 7, 1003-1006 (1997). 

44. Marsters, S., Sheridan, J., Pitti, R., Brush, J., Goddard, A., and Ashkenazi, A. 
Identification of a ligand for the death-domain-containing receptor Apo3. Curr. Biol. 
8, 525-528 (1998). 

45. Rieger, J., Naumann, U., Glaser, T., Ashkenazi, A., and Weller, M. Apo2 ligand: 
a novel weapon against malignant glioma? FEBS Lett. 427, 124-128 (1998). 

46. Pender, S., Fell, J., Chamow, S., Ashkenazi, A ., and MacDonald, T. A p55 TNF 

47. Pitti, R., Marsters, S., Lawrence, D., Roy, M., Kischkel, F., Dowd, P., Huang, A., 
Donahue, C., Sherwood, S., Baldwin, D., Godowski, P., Wood, W., Gurney, A., 
Hillan, K., Cohen, R., Goddard, A., Botstein, D., and Ashkenazi, A. Genomic 
amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature 
396, 699-703 (1998). 

48. Mori, S., Murakami-Mori, K., Nakamura, S., Ashkenazi, A., and Bonavida, B. 
Sensitization of AIDS Kaposi's sarcoma cells to Apo-2 ligand-induced apoptosis 
by actinomycin D. J. Immunol. 162, 5616-5623 (1999). 

49. Gurney, A., Marsters, S., Huang, A., Pitti, R., Mark, M., Baldwin, D., Gray, A., 
Dowd, P., Brush, J., Heldens, S., Schow, P., Goddard, A., Wood, W., Baker, K., 
Godowski, P., and Ashkenazi, A. Identification of a new member of the tumor 
necrosis factor family and its receptor, a human ortholog of mouse GITR. Curr. 
Biol. 9, 215-218 (1999). 

50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie, 
C., Chang, L., McMurtrey, A., Hebert, A., DeForge, L., Khoumenis, I., Lewis, D., 
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and 
anti-tumor activity of recombinant soluble Apo2 ligand. J. Clin. Invest. 104, 155-162 
(1999). 

51. Chuntharapai, A., Gibbs, V., Lu, J., Ow, A., Marsters, S., Ashkenazi, A., De Vos, 
A., Kim, K.J. Determination of residues involved in ligand binding and signal 
transmission in the human IFN-a receptor 2. J. Immunol. 163, 766-773 (1999). 

52. Johnsen, A.-C, Haux, J., Steinkjer, B., Nonstad, U., Egeberg, K., Sundan, A., 
Ashkenazi, A., and Espevik, T. Regulation of Apo2L/TRAIL expression in NK
cells - involvement in NK cell-mediated cytotoxicity. Cytokine 11, 664-672
(1999). 

53. Roth, W., Isenmann, S., Naumann, U., Kugler, S., Bahr, M., Dichgans,
Ashkenazi, A., and Weller, M. Eradication of intracranial human malignant

Inhibition of erythroid colony formation in vitro by gamma interferon. In
Molecular Biology of Hematopoiesis (N. Abraham, R. Shadduck, A. Levine, F.
Takaku, eds.) Intercept Ltd. Paris, Vol. 3, p. 135-147 (1994).

7. Ashkenazi, A. Cytokine neutralization as a potential therapeutic approach for
SIRS and shock. J. Biotechnology in Healthcare 1, 197-206 (1994).

8. Ashkenazi, A., and Chamow, S. M. Immunoadhesins: an alternative to human
monoclonal antibodies. Immunomethods: A companion to Methods in
Enzymology 8, 104-115 (1995).

9. Chamow, S., and Ashkenazi, A. Immunoadhesins: Principles and Applications.
Trends Biotech. 14, 52-60 (1996).

10. Ashkenazi, A., and Chamow, S. M. Immunoadhesins as research tools and
therapeutic agents. Curr. Opin. Immunol. 9, 195-200 (1997).

11. Ashkenazi, A., and Dixit, V. Death receptors: signaling and modulation. Science
281, 1305-1308 (1998).

12. Ashkenazi, A., and Dixit, V. Apoptosis control by death and decoy receptors.
Curr. Opin. Cell. Biol. 11, 255-260 (1999).

13. Ashkenazi, A. Chapters on Apo2L/TRAIL; DR4, DR5, DcR1, DcR2; and DcR3.
Online Cytokine Handbook (www.apnet.com/cytokinereference/).

14. Ashkenazi, A. Targeting death and decoy receptors of the tumor necrosis factor
superfamily. Nature Rev. Cancer 2, 420-430 (2002).

15. LeBlanc, H. and Ashkenazi, A. Apoptosis signaling by Apo2L/TRAIL. Cell Death
and Differentiation 10, 66-75 (2003).

16. Almasan, A. and Ashkenazi, A. Apo2L/TRAIL: apoptosis signaling, biology, and
potential for cancer therapy. Cytokine and Growth Factor Reviews 14, 337-348
(2003).

Book: 

Antibody Fusion Proteins (Chamow, S., and Ashkenazi, A., eds., John Wiley and
Sons Inc.) (1999). 

Talks: 

1. Resistance of primary HIV isolates to CD4 is independent of CD4-gp120 binding
affinity. UCSD Symposium, HIV Disease: Pathogenesis and Therapy.
Greenelefe, FL, March 1991.

2. Use of immuno-hybrids to extend the half-life of receptors. IBC conference on
Biopharmaceutical Halflife Extension. New Orleans, LA, June 1992.

3. Results with TNF receptor immunoadhesins for the Treatment of Sepsis. IBC
conference on Endotoxemia and Sepsis. Philadelphia, PA, June 1992.

4. Immunoadhesins: an alternative to human antibodies. IBC Conference on
Antibody Engineering. San Diego, CA, December 1993.

5. Tumor necrosis factor receptor: a potential therapeutic for human septic shock.
American Society for Microbiology Meeting, Atlanta, GA, May 1993.

6. Protective efficacy of TNF receptor immunoadhesin vs anti-TNF monoclonal
antibody in a rat model for endotoxic shock. 5th International Congress on TNF.
Asilomar, CA, May 1994.

7. Interferon-γ signals via a multisubunit receptor complex that contains two types of
polypeptide chain. American Association of Immunologists Conference. San
Francisco, CA, July 1995.

8. Immunoadhesins: Principles and Applications. Gordon Research Conference on
Drug Delivery in Biology and Medicine. Ventura, CA, February 1996.

9. Apo-2 Ligand, a new member of the TNF family that induces apoptosis in tumor
cells. Cambridge Symposium on TNF and Related Cytokines in Treatment of
Cancer. Hilton-Head, NC, March 1996.

10. Induction of apoptosis by Apo2 Ligand. American Society for Biochemistry and 
Molecular Biology, Symposium on Growth Factors and Cytokine Receptors. New 
Orleans, LA, June, 1996. 

11. Apo2 ligand, an extracellular trigger of apoptosis. 2nd Clontech Symposium,
Palo Alto, CA, October 1996.

12. Regulation of apoptosis by members of the TNF ligand and receptor families.
Stanford University School of Medicine, Palo Alto, CA, December 1996.

13. Apo-3: a novel receptor that regulates cell death and inflammation. 4th
International Congress on Immune Consequences of Trauma, Shock, and Sepsis.
Munich, Germany, March 1997.

14. New members of the TNF ligand and receptor families that regulate apoptosis,
inflammation, and immunity. UCLA School of Medicine, LA, CA, March 1997.

15. Immunoadhesins: an alternative to monoclonal antibodies. 5th World Conference 
on Bispecific Antibodies. Volendam, Holland, June 1997. 

16. Control of Apo2L signaling. Cold Spring Harbor Laboratory Symposium on 
Programmed Cell Death. Cold Spring Harbor, New York. September, 1997. 

17. Chairman and speaker, Apoptosis Signaling session. IBC's 4th Annual
Conference on Apoptosis. San Diego, CA, October 1997.

18. Control of Apo2L signaling by death and decoy receptors. American Association
for the Advancement of Science. Philadelphia, PA, February 1998.

19. Apo2 ligand and its receptors. American Society of Immunologists. San 
Francisco, CA, April 1998. 

20. Death receptors and ligands. 7th International TNF Congress. Cape Cod, MA,
May 1998.

21. Apo2L as a potential therapeutic for cancer. UCLA School of Medicine. LA,
CA, June 1998.

22. Apo2L as a potential therapeutic for cancer. Gordon Research Conference on
Cancer Chemotherapy. New London, NH, July 1998.

23. Control of apoptosis by Apo2L. Endocrine Society Conference, Stevenson, WA,
August 1998.

24. Control of apoptosis by Apo2L. International Cytokine Society Conference, 
Jerusalem, Israel, October 1998. 



25. Apoptosis control by death and decoy receptors. American Association for 
Cancer Research Conference, Whistler, BC, Canada, March 1999.

26. Apoptosis control by death and decoy receptors. American Society for 
Biochemistry and Molecular Biology Conference, San Francisco, CA, May 1999. 

27. Apoptosis control by death and decoy receptors. Gordon Research Conference on 
Apoptosis, New London, NH, June 1999. 

28. Apoptosis control by death and decoy receptors. Arthritis Foundation Research 
Conference, Alexandria GA, Aug 1999. 

29. Safety and anti-tumor activity of recombinant soluble Apo2L/TRAIL. Cold 
Spring Harbor Laboratory Symposium on Programmed Cell Death. Cold Spring
Harbor, NY, September 1999. 

30. The Apo2L/TRAIL system: therapeutic potential. American Association for 
Cancer Research, Lake Tahoe, NV, Feb 2000. 

31. Apoptosis and cancer therapy. Stanford University School of Medicine, Stanford,
CA, Mar 2000. 

32. Apoptosis and cancer therapy. University of Pennsylvania School of Medicine, 
Philadelphia, PA, Apr 2000. 

33. Apoptosis signaling by Apo2L/TRAIL. International Congress on TNF.
Trondheim, Norway, May 2000.

34. The Apo2L/TRAIL system: therapeutic potential. Cap-CURE summit meeting.
Santa Monica, CA, June 2000.

35. The Apo2L/TRAIL system: therapeutic potential. MD Anderson Cancer Center.
Houston, TX, June 2000.

36. Apoptosis signaling by Apo2L/TRAIL. The Protein Society, 14th Symposium.
San Diego, CA, August 2000.

37. Anti-tumor activity of Apo2L/TRAIL. AAPS annual meeting. Indianapolis, IN,
Aug 2000.

38. Apoptosis signaling and anti-cancer potential of Apo2L/TRAIL. Cancer Research 
Institute, UC San Francisco, CA, September 2000. 

39. Apoptosis signaling by Apo2L/TRAIL. Keynote address, TNF family
Minisymposium, NIH. Bethesda, MD, September 2000.

40. Death receptors: signaling and modulation. Keystone symposium on the 
Molecular basis of cancer. Taos, NM, Jan 2001. 

41. Preclinical studies of Apo2L/TRAIL in cancer. Symposium on Targeted therapies
in the treatment of lung cancer. Aspen, CO, Jan 2001. 



42. Apoptosis signaling by Apo2L/TRAIL. Weizmann Institute of Science, Rehovot,
Israel, March 2001. 

43. Apo2L/TRAIL: Apoptosis signaling and potential for cancer therapy. Weizmann 
Institute of Science, Rehovot, Israel, March 2001. 

44. Targeting death receptors in cancer with Apo2L/TRAIL. Cell Death and Disease 
conference, North Falmouth, MA, Jun 2001. 

45. Targeting death receptors in cancer with Apo2L/TRAIL. Biotechnology 
Organization conference, San Diego, CA, Jun 2001. 

46. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Gordon Research 
Conference on Apoptosis, Oxford, UK, July 2001. 

47. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Cleveland Clinic 
Foundation, Cleveland, OH, Oct 2001. 

48. Apoptosis signaling by death receptors: overview. International Society for 
Interferon and Cytokine Research conference, Cleveland, OH, Oct 2001.

49. Apoptosis signaling by death receptors. American Society of Nephrology
Conference. San Francisco, CA, Oct 2001.

50. Targeting death receptors in cancer. Apoptosis: commercial opportunities. San 
Diego, CA, Apr 2002. 

51. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Kimmel Cancer
Research Center, Johns Hopkins University, Baltimore, MD. May 2002.

52. Apoptosis control by Apo2L/TRAIL. (Keynote Address) University of Alabama
Cancer Center Retreat, Birmingham, AL. October 2002.

53. Apoptosis signaling by Apo2L/TRAIL. (Session co-chair) TNF international
conference. San Diego, CA. October 2002.

54. Apoptosis signaling by Apo2L/TRAIL. Swiss Institute for Cancer Research
(ISREC). Lausanne, Switzerland. Jan 2003.

55. Apoptosis induction with Apo2L/TRAIL. Conference on New Targets and
Innovative Strategies in Cancer Treatment. Monte Carlo. February 2003.

56. Apoptosis signaling by Apo2L/TRAIL. Hermelin Brain Tumor Center 
Symposium on Apoptosis. Detroit, MI. April 2003. 

57. Targeting apoptosis through death receptors. Sixth Annual Conference on 
Targeted Therapies in the Treatment of Breast Cancer. Kona, Hawaii. July 2003.

58. Targeting apoptosis through death receptors. Second International Conference on 
Targeted Cancer Therapy. Washington, DC. Aug 2003. 

Issued Patents: 
1. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,329,028 (Jul 12, 1994).

2. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,605,791 (Feb 25, 1997).

3. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,889,155 (Jul 27, 1999).

4. Ashkenazi, A., APO-2 Ligand. US patent 6,030,945 (Feb 29, 2000).

5. Ashkenazi, A., Chuntharapai, A., Kim, J., APO-2 ligand antibodies. US patent
6,046,048 (Apr 4, 2000).

6. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 6,124,435 (Sep 26, 2000).

7. Ashkenazi, A., Chuntharapai, A., Kim, J., Method for making monoclonal and
cross-reactive antibodies. US patent 6,252,050 (Jun 26, 2001).

8. Ashkenazi, A. APO-2 Receptor. US patent 6,342,369 (Jan 29, 2002).

9. Ashkenazi, A., Fong, S., Goddard, A., Gurney, A., Napier, M., Tumas, D., Wood, W.
A-33 polypeptides. US patent 6,410,708 (Jun 25, 2002).

10. Ashkenazi, A. APO-3 Receptor. US patent 6,462,176 B1 (Oct 8, 2002).

11. Ashkenazi, A. APO-2LI and APO-3 polypeptide antibodies. US patent 6,469,144 B1
(Oct 22, 2002).

12. Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 6,582,928 B1 (Jun 24, 2003).
PATENT

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re Application of: Ashkenazi et al.
Serial No.: 09/903,925
Filed: July 11, 2001
Group Art Unit: 1647
Examiner: Fozia Hamid

DECLARATION OF AUDREY D. GODDARD, Ph.D. UNDER 37 C.F.R. § 1.132

Assistant Commissioner of Patents 
Washington, D.C. 20231



Sir: 

I, Audrey D. Goddard, Ph.D., do hereby declare and say as follows:

1. I am a Senior Clinical Scientist at the Experimental Medicine/BioOncology, Medical
Affairs Department of Genentech, Inc., South San Francisco, California 94080.

2. Between 1993 and 2001, I headed the DNA Sequencing Laboratory at the Molecular
Biology Department of Genentech, Inc. During this time, my responsibilities included the 
identification and characterization of genes contributing to the oncogenic process, and determination 
of the chromosomal localization of novel genes. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to and 
forms part of this Declaration (Exhibit A). 

4. I am familiar with a variety of techniques known in the art for detecting and 
quantifying the amplification of oncogenes in cancer, including the quantitative TaqMan PCR (i.e., 
"gene amplification") assay described in the above captioned patent application. 

5. The TaqMan PCR assay is described, for example, in the following scientific 
publications: Higuchi et al., Biotechnology 10:413-417 (1992) (Exhibit B); Livak et al., PCR
Methods Appl. 4:357-362 (1995) (Exhibit C) and Heid et al., Genome Res. 6:986-994 (1996)
(Exhibit D). Briefly, the assay is based on the principle that successful PCR yields a fluorescent 
signal due to Taq DNA polymerase-mediated exonuclease digestion of a fluorescently labeled 
oligonucleotide that is homologous to a sequence between two PCR primers. The extent of 
digestion depends directly on the amount of PCR, and can be quantified accurately by measuring the 
increment in fluorescence that results from decreased energy transfer. This is an extremely sensitive 
technique, which allows detection in the exponential phase of the PCR reaction and, as a result, 
leads to accurate determination of gene copy number. 
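
By way of illustration only, the relationship between exponential-phase detection and relative gene copy number can be sketched as follows. The sketch assumes the comparative threshold-cycle (delta-delta Ct) calculation with ideal doubling per cycle, normalized to a reference gene; neither that calculation, the function name, nor the Ct values shown is taken from this Declaration, and all are hypothetical.

```python
def relative_copy_number(ct_target_tumor, ct_ref_tumor,
                         ct_target_normal, ct_ref_normal,
                         efficiency=2.0):
    """Estimate the fold change in target gene copy number, tumor vs. normal.

    Assumption (not from the Declaration): comparative Ct method.
    A Ct is the cycle at which fluorescence crosses a fixed threshold
    during the exponential phase; a lower Ct means more starting template.
    efficiency=2.0 assumes perfect doubling per cycle.
    """
    # Normalize the target gene to a reference gene within each sample.
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_normal = ct_target_normal - ct_ref_normal
    # Compare the normalized tumor value with the normalized normal value.
    delta_delta = delta_tumor - delta_normal
    return efficiency ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in tumor than in normal tissue, while the reference gene is unchanged.
fold = relative_copy_number(24.0, 26.0, 26.0, 26.0)
print(fold)  # 4.0
```

Under these assumptions, each cycle by which the tumor sample crosses the threshold earlier corresponds to a doubling of the estimated copy number.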

6. The quantitative fluorescent TaqMan PCR assay has been extensively and
successfully used to characterize genes involved in cancer development and progression. 
Amplification of protooncogenes has been studied in a variety of human tumors, and is widely 
considered as having etiological, diagnostic and prognostic significance. This use of the quantitative 
TaqMan PCR assay is exemplified by the following scientific publications: Pennica et al., Proc.
Natl. Acad. Sci. USA 95(25):14717-14722 (1998) (Exhibit E); Pitti et al., Nature
396(6712):699-703 (1998) (Exhibit F) and Bieche et al., Int. J. Cancer 78:661-666 (1998) (Exhibit
G), of the first two of which I am a co-author. In particular, Pennica et al. have used the quantitative
TaqMan PCR assay to study relative gene amplification of WISP and c-myc in various cell lines, 
colorectal tumors and normal mucosa. Pitti et al. studied the genomic amplification of a decoy 
receptor for Fas ligand in lung and colon cancer, using the quantitative TaqMan PCR assay. Bieche 
et al. used the assay to study gene amplification in breast cancer. 

7. It is my personal experience that the quantitative TaqMan PCR technique is 
technically sensitive enough to detect at least a 2-fold increase in gene copy number relative to 
control. It is further my considered scientific opinion that an at least 2-fold increase in gene copy 
number in a tumor tissue sample relative to a normal (i.e., non-tumor) sample is significant and 
useful in that the detected increase in gene copy number in the tumor sample relative to the normal 
sample serves as a basis for using relative gene copy number as quantitated by the TaqMan PCR 
technique as a diagnostic marker for the presence or absence of tumor in a tissue sample of unknown
pathology. Accordingly, a gene identified as being amplified at least 2-fold by the quantitative 
TaqMan PCR assay in a tumor sample relative to a normal sample is useful as a marker for the 
diagnosis of cancer, for monitoring cancer development and/or for measuring the efficacy of cancer 
therapy. 
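
For illustration only, the at-least-2-fold criterion stated in paragraph 7 can be expressed as a simple comparison. The function name, constant name, and sample values below are hypothetical and are not part of the assay described in the application; the only figure taken from this Declaration is the 2-fold threshold.

```python
# "At least 2-fold" threshold, as stated in paragraph 7 of the Declaration.
AMPLIFICATION_THRESHOLD = 2.0


def is_amplified(tumor_copy_number, normal_copy_number,
                 threshold=AMPLIFICATION_THRESHOLD):
    """Return True if tumor copy number is at least `threshold`-fold normal.

    Both inputs are relative gene copy numbers measured on the same scale
    (hypothetical values here); the normal value must be positive.
    """
    if normal_copy_number <= 0:
        raise ValueError("normal_copy_number must be positive")
    return tumor_copy_number / normal_copy_number >= threshold


print(is_amplified(4.0, 2.0))  # True: exactly 2-fold meets the criterion
print(is_amplified(3.0, 2.0))  # False: 1.5-fold falls below it
```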

8. I declare further that all statements made herein of my own knowledge are true and
that all statements made on information and belief are believed to be true. I declare that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code, and that such willful false statements may jeopardize the validity of the application or any 
patent issuing thereon. 

Date 



Audrey D. Goddard, Ph.D. 
AUDREY D. GODDARD, Ph.D.

Genentech, Inc.
1 DNA Way
South San Francisco, CA, 94080
650.225.6429
goddarda@gene.com

110 Congo St.
San Francisco, CA, 94131
415.819.2247 (mobile)
agoddard@pacbell.net



PROFESSIONAL EXPERIENCE 

Genentech, Inc. 1993-present
South San Francisco, CA

2001 - present Senior Clinical Scientist

Experimental Medicine / BioOncology, Medical Affairs

Responsibilities:

• Companion diagnostic oncology products

• Acquisition of clinical samples from Genentech's clinical trials for translational research

• Translational research using clinical specimen and data for drug development and

• Member of Development Science Review Committee, Diagnostic Oversight Team, 21 CFR
Part 11 Subteam

• Ethical and legal implications of experiments with clinical specimens and data

• Application of pharmacogenomics in clinical trials

1998 - 2001 Senior Scientist

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research

• Management of a laboratory of up to nineteen, including postdoctoral fellow, associate
scientist, senior research associate and research assistant/associate levels

• Management of a $750K budget

• DNA sequencing core facility supporting a 350+ person research facility.

• DNA sequencing for high throughput gene discovery: ESTs, cDNAs, and constructs

• Genomic sequence analysis and gene identification

• DNA sequence and primary protein analysis

Research:

• Chromosomal localization of novel genes

• Identification and characterization of genes contributing to the oncogenic process

• Identification and characterization of genes contributing to inflammatory diseases

• Design and development of schemes for high throughput genomic DNA sequence analysis

• Candidate gene prediction and evaluation
1993 - 1998 Scientist

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research

Responsibilities:

• DNA sequencing core facility supporting a 350+ person research facility 

• Assumed responsibility for a pre-existing team of five technicians and expanded the group 
into fifteen, introducing a level of middle management and additional areas of research 

• Participated in the development of the basic plan for high throughput secreted protein 
discovery program - sequencing strategies, data analysis and tracking, database design 

• High throughput EST and cDNA sequencing for new gene identification.

• Design and implementation of analysis tools required for high throughput gene identification. 

• Chromosomal localization of genes encoding novel secreted proteins. 

Research: 

• Genomic sequence scanning for new gene discovery. 

• Development of signal peptide selection methods. 

• Evaluation of candidate disease genes. 

• Growth hormone receptor gene SNPs in children with Idiopathic short stature 

Imperial Cancer Research Fund 1989-1992 
London, UK with Dr. Ellen Solomon 

6/89-12/92 Postdoctoral Fellow 

• Cloning and characterization of the genes fused at the acute promyelocytic leukemia
translocation breakpoints on chromosomes 17 and 15. 

• Prepared a successfully funded European Union multi-center grant application 



McMaster University 

Hamilton, Ontario, Canada with Dr. G. D. Sweeney 
5/83 - 8/83: NSERC Summer Student 

• In vitro metabolism of β-naphthoflavone in C57BL/6J and DBA mice



EDUCATION

Ph.D. University of Toronto, Toronto, Ontario, Canada. 1989
Department of Medical Biophysics.
"Phenotypic and genotypic effects of mutations in
the human retinoblastoma gene."
Supervisor: Dr. R. A. Phillips

Honours B.Sc. McMaster University, Hamilton, Ontario, Canada. 1983
Department of Biochemistry
"The in vitro metabolism of the cytochrome P-448
inducer β-naphthoflavone in C57BL/6J mice."
Supervisor: Dr. G. D. Sweeney
ACADEMIC AWARDS

Imperial Cancer Research Fund Postdoctoral Fellowship 1989-1992

Medical Research Council Studentship 1983-1988

NSERC Undergraduate Summer Research Award 1983

Society of Chemical Industry Merit Award (Hons. Biochem.) 1983

Dr. Harry Lyman Hooker Scholarship 1981-1983

J.L.W. Gill Scholarship 1981-1982

Business and Professional Women's Club Scholarship 1980-1981

Weyerhaeuser Foundation Scholarship 1979-1980



INVITED PRESENTATIONS 

Genentech's gene discovery pipeline: High throughput identification, cloning and 
characterization of novel genes. Functional Genomics: From Genome to Function, Litchfield 
Park, AZ, USA. October 2000 

High throughput identification, cloning and characterization of novel genes. G2K:Back to 
Science, Advances in Genome Biology and Technology I. Marco Island, FL, USA. February
2000

Quality control in DNA Sequencing: The use of Phred and Phrap. Bay Area Sequencing 
Users Meeting, Berkeley, CA, USA. April 1999 

High throughput secreted protein identification and cloning. Tenth International Genome 
Sequencing and Analysis Conference, Miami, FL, USA. September 1998

The evolution of DNA sequencing: The Genentech perspective. Bay Area Sequencing Users
Meeting, Berkeley, CA, USA. May 1998 

Partial Growth Hormone Insensitivity: The role of GH-receptor mutations in Idiopathic Short 
Stature. Tenth Annual National Cooperative Growth Study Investigators Meeting, San 
Francisco, CA, USA. October, 1996 

Growth hormone (GH) receptor defects are present in selected children with non-GH-deficient 
short stature: A molecular basis for partial GH-insensitivity. 76th Annual Meeting of The
Endocrine Society, Anaheim, CA, USA. June 1994 

A previously uncharacterized gene, myl, is fused to the retinoic acid receptor alpha gene in 
acute promyelocytic leukemia. XV International Association for Comparative Research on
Leukemia and Related Disease, Padua, Italy. October 1991 
PATENTS

Goddard A, Godowski PJ, Gurney AL. NL2 Tie ligand homologue polypeptide. Patent
Number: 6,455,496. Date of Patent: Sept. 24, 2002.

Goddard A, Godowski PJ and Gurney AL. NL3 Tie ligand homologue nucleic acids. Patent
Number: 6,426,218. Date of Patent: July 30, 2002.

Godowski P, Gurney A, Hillan KJ, Botstein D, Goddard A, Roy M, Ferrara N, Tumas D,
Schwall R. NL4 Tie ligand homologue nucleic acid. Patent Number: 6,413,770. Date of
Patent: July 2, 2002.

Ashkenazi A, Fong S, Goddard A, Gurney AL, Napier MA, Tumas D, Wood WI. Nucleic acid
encoding A-33 related antigen polypeptides. Patent Number: 6,410,708. Date of Patent:
Jun. 25, 2002.

Botstein DA, Cohen RL, Goddard AD, Gurney AL, Hillan KJ, Lawrence DA, Levine AJ,
Pennica D, Roy MA and Wood WI. WISP polypeptides and nucleic acids encoding same.
Patent Number: 6,387,657. Date of Patent: May 14, 2002.

Goddard A, Godowski PJ and Gurney AL. Tie ligands. Patent Number: 6,372,491. Date of
Patent: April 16, 2002.

Godowski PJ, Gurney AL, Goddard A and Hillan K. TIE ligand homologue antibody. Patent
Number: 6,350,450. Date of Patent: Feb. 26, 2002.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Tie
receptor tyrosine kinase ligand homologues. Patent Number: 6,348,351. Date of Patent:
Feb. 19, 2002.

Goddard A, Godowski PJ and Gurney AL. Ligand homologues. Patent Number: 6,348,350.
Date of Patent: Feb. 19, 2002.

Attie KM, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth
hormone insensitivity syndrome. Patent Number: 6,207,640. Date of Patent: March 27,
2001.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Nucleic
acids encoding NL-3. Patent Number: 6,074,873. Date of Patent: June 13, 2000.

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,824,642. Date of Patent: October 20, 1998.

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,646,113. Date of Patent: July 8, 1997.

Multiple additional provisional applications filed 
PUBLICATIONS

Seshasayee D, Dowd P, Gu Q, Erickson S, Goddard AD. Comparative sequence analysis of
the HER2 locus in mouse and man. Manuscript in preparation.

Abuzzahab MJ, Goddard A, Grigorescu F, Lautier C, Smith RJ and Chernausek SD. Human 
IGF-1 receptor mutations resulting in pre- and post-natal growth retardation. Manuscript in 
preparation. 

Aggarwal S, Xie, M-H, Foster J, Frantz G, Stinson J, Corpuz RT, Simmons L, Hillan K, 
Yansura DG, Vandlen RL, Goddard AD and Gurney AL. FHFR, a novel receptor for the 
fibroblast growth factors. Manuscript submitted. 

Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman 
S., Lewin DA. (2001) BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown 
adipose tissue: Cloning, organization of the human gene, and assessment of a potential link 
to obesity. Biochemical Journal 360: 135-142.

Lee J, Ho WH, Maruoka M, Corpuz RT, Baldwin DT, Foster JS, Goddard AD, Yansura DG,
Vandlen RL, Wood WI, Gurney AL. (2001) IL-17E, a novel proinflammatory ligand for the
IL-17 receptor homolog IL-17Rh1. Journal of Biological Chemistry 276(2): 1660-1664.

Xie M-H, Aggarwal S, Ho W-H, Foster J, Zhang Z, Stinson J, Wood WI, Goddard AD and
Gurney AL. (2000) Interleukin (IL)-22, a novel human cytokine that signals through the 
interferon-receptor related proteins CRF2-4 and IL-22R. Journal of Biological Chemistry 275: 
31335-31339. 

Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS. (2000) Rapid mapping of 
protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. USA 97: 
8950-8954. 

Guo S, Yamaguchi Y, Schilbach S, Wada T, Lee J, Goddard A, French D, Handa H,
Rosenthal A. (2000) A regulator of transcriptional elongation controls vertebrate neuronal 
development. Nature 408: 366-369. 

Yan M, Wang L-C, Hymowitz SG, Schilbach S, Lee J, Goddard A, de Vos AM, Gao WQ, Dixit 
VM. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates 
binding to two distinct receptors. Science 290: 523-527. 

Sehl PD, Tai JTN, Hillan KJ, Brown LA, Goddard A, Yang R, Jin H and Lowe DG. (2000) 
Application of cDNA microarrays in determining molecular phenotype in cardiac growth, 
development, and response to injury. Circulation 101: 1990-1999. 

Guo S, Brush J, Teraoka H, Goddard A, Wilson SW, Mullins MC and Rosenthal A. (1999) 
Development of noradrenergic neurons in the zebrafish hindbrain requires BMP, FGF8, and 
the homeodomain protein soulless/Phox2A. Neuron 24: 555-566. 

Stone D, Murone M, Luoh S, Ye W, Armanini P, Gurney A, Phillips HS, Brush J, Goddard
A, de Sauvage FJ and Rosenthal A. (1999) Characterization of the human suppressor of
fused; a negative regulator of the zinc-finger transcription factor Gli. J. Cell Sci. 112:
4437-4448.

Xie M-H, Holcomb I, Deuel B, Dowd P, Huang A, Vagts A, Foster J, Liang J, Brush J, Gu Q,
Hillan K, Goddard A and Gurney AL. (1999) FGF-19, a novel fibroblast growth factor with
We have enhanced the polymerase chain reaction (PCR) such that specific DNA
sequences can be detected without opening the reaction tube. This enhancement
requires the addition of ethidium bromide (EtBr) to a PCR. Since the fluorescence of
EtBr increases in the presence of double-stranded (ds) DNA, an increase in
fluorescence in such a PCR indicates a positive amplification, which can be easily
monitored externally. In fact, amplification can be continuously monitored in order
to follow its progress. The ability to simultaneously amplify specific DNA sequences
and detect the product of the amplification both simplifies and improves PCR and
may facilitate its automation and more widespread use in the clinic or in other
situations requiring high sample throughput.

Although the potential benefits of PCR to clinical diagnostics are well known, it is
still not widely used in this setting, even though it is four years since thermostable
DNA polymerases made PCR practical. Some of the reasons for its slow acceptance are
high cost, lack of automation of pre- and post-PCR processing steps, and false
positive results from carryover-contamination. The first two points are related in
that labor is the largest contributor to cost at the present stage of PCR
development. Most current assays require some form of "downstream" processing once
thermocycling is done in order to determine whether the target DNA sequence was
present and has amplified. These [...] without use of restriction digestion, HPLC, or
capillary electrophoresis. These methods are labor-intense, have low throughput, and
are difficult to automate. The third point is also closely related to downstream
processing. The handling of the PCR product in these downstream processes increases
the chances that amplified DNA will spread through the typing lab, resulting in a
risk of "carryover" false positives in subsequent testing.

These downstream processing steps would be eliminated if specific amplification and
detection of amplified DNA took place simultaneously within an unopened reaction
vessel. Assays in which such different processes take place without the need to
separate reaction components have been termed "homogeneous". No truly homogeneous
PCR assay has been demonstrated to date, although progress towards this end has been
reported. Chehab, et al., developed a PCR product detection scheme using fluorescent
primers that resulted in a fluorescent PCR product. Allele-specific primers, each
with different fluorescent tags, were used to indicate the genotype of the DNA.
However, the unincorporated primers must still be removed in a downstream process in
order to visualize the result. Recently, Holland, et al., developed an assay in which
the endogenous 5' exonuclease activity of Taq DNA polymerase was exploited to cleave
a labeled oligonucleotide probe. The probe would only cleave if PCR amplification had
produced its complementary sequence. In order to detect the cleavage products,
however, a subsequent process is again needed.

We have developed a truly homogeneous assay for PCR and PCR product detection based
upon the greatly increased fluorescence that ethidium bromide and other DNA binding
dyes exhibit when they are bound to ds-DNA. As outlined in Figure 1, a prototype PCR
FIGURE 1 Principle of simultaneous amplification and detection of PCR product. The
components of a PCR containing EtBr that are fluorescent are listed: EtBr itself, and
EtBr bound to either ssDNA or dsDNA. There is a large fluorescence enhancement when
EtBr is bound to DNA, and binding is greatly enhanced when DNA is double-stranded.
After sufficient (n) cycles of PCR, the net increase in dsDNA results in additional
EtBr binding, and a net increase in total fluorescence.
FIGURE 2 Gel electrophoresis of PCR amplification products of the human nuclear
gene, HLA DQα, made in the presence of increasing amounts of EtBr (up to 6 μg/ml).
The presence of EtBr has no obvious effect on the yield or specificity of
amplification.




o 17 
\ \ 



21 
I 



25 29 
/ / 



2^9 
mole 




HGBK 3 (A) Fluorescence measurement* froth PCRs that contain 
0.5 u.g/m! EtBr and that are specific for Y-cJiroinosonoe repeat 
sequence*. Five rrplicatc PCRs were begun containing each of tbe 
DNAs specified. At leach indicated cycle, one of the five replicate 
PCRs for each DNA was removed from thermocyding and its 
fluorescence measured. Units of fluorcaccnc* are aitntrair. (£) 
UV photography of PCRtube* (0.5 ml Eppcndorf -style, polypro- 
pylene micjxxemiifuge tubes) containing reactions, those sta.«> 
ing from 2 ng male DNA and control reactions without any DNA, 
from (A). 



begins with primers that are single-stranded DNA (ss- 
DNA), dNTPs, and DNA polymerase; An amount of 
dsDNA containing the target sequence (target DNA) is 
also typically present. This amount can vary, depending 
on the application, from suisjle-ceU amounts of DNA 1T to 
micrograms per PCR" If EtBr is present, the reagents 
that will fluoresce, in order of increasing fluorescence, are 
free EtBr itself, and EtBr bound to the single-stranded 
DNA primers and to the double-stranded target DNA (by 
its intercalation between the stacked bases of the DNA 
doublc-hcfw). After the first denaturation cyde, target 
DNA will be largely single-stranded. After a FCR is 
completed, the most significant change is the increase in 
the amount of dsDNA (tbe PCR product itself) of up to 
several micrograms. Formerly free EtBr is bound to the 
additional cbDNA, resulting in an increase in fluores- 
cence. There is also some decrease in the amount of 
ssDNA primer, but because the binding of EtBr to ssDNA 
is much Jess than to dsDNA, the effect of this change on 
the total fluorescence of the sample is small. The fluores- 
cence increase can be measured by directing excitation 
illumination through the walls of the amplification, vessel 
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before and after, or even continuously during, therrnocy. 
ding. 

RESULTS 

PCR in the presence of EtBr. In order to assess the
effect of EtBr in PCR, amplifications of the human HLA
DQα gene 19 were performed with the dye present at
concentrations from 0.06 to 8.0 µg/ml (a typical
concentration of EtBr used in staining of nucleic acids following
gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel
electrophoresis revealed little or no difference in the yield
or quality of the amplification product whether EtBr was
absent or present at any of these concentrations,
indicating that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific sequences.
Sequence-specific, fluorescence enhancement of
EtBr as a result of PCR was demonstrated in a series of
amplifications containing 0.5 µg/ml EtBr and primers
specific to repeat DNA sequences found on the human
Y-chromosome 20. These PCRs initially contained either
60 ng male, 60 ng female, 2 ng male human or no DNA.
Five replicate PCRs were begun for each DNA. After 0,
17, 21, 24 and 29 cycles of thermocycling, a PCR for each
DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted
vs. amplification cycle number (Fig. 3A). The shape of this
curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is
becoming linear and not exponential with cycle number.
As shown, the fluorescence increased about three-fold
over the background fluorescence for the PCRs containing
human male DNA, but did not significantly increase
for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA
present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in
fluorescence. Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the
expected size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.

In addition, the increase in fluorescence was visualized
by simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red
filter. This is shown in Figure 3B for the reactions that
began with 2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin
gene. In order to demonstrate that this approach has
adequate specificity to allow genetic screening, a detection
of the sickle-cell anemia mutation was performed. Figure
4 shows the fluorescence from completed amplifications
containing EtBr (0.5 µg/ml) as detected by photography
of the reaction tubes on a UV transilluminator. These
reactions were performed using primers specific for
either the wild-type or sickle-cell mutation of the human
β-globin gene 21. The specificity for each allele is imparted
by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer
annealing temperature, primer extension, and thus
amplification, can take place only if the 3' nucleotide of the
primer is complementary to the β-globin allele present.

Each pair of amplifications shown in Figure 4 consists of
a reaction with either the wild-type allele specific (left
tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous,
wild-type β-globin individual (AA); from a heterozygous
sickle β-globin individual (AS); and from a homozygous
sickle β-globin individual (SS). Each DNA (50 ng genomic
DNA to start each PCR) was analyzed in triplicate (3 pairs
of reactions each). The DNA type was reflected in the
relative fluorescence intensities in each pair of completed
amplifications. There was a significant increase in
fluorescence only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer
(data not shown), this fluorescence was about three times
that present in a PCR where both β-globin alleles were
mismatched to the primer set. Gel electrophoresis (not
shown) established that this increase in fluorescence was
due to the synthesis of nearly a microgram of a DNA
fragment of the expected size for β-globin. There was
little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.
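
The pairwise reading of the tubes described above can be written as a
small decision rule. The sketch below is illustrative only: the function
name call_genotype, the arbitrary fluorescence units, and the two-fold
cutoff over a fully mismatched reaction are assumptions made for the
example. The text itself reports only that matched reactions fluoresced
about three times as brightly as fully mismatched ones.

```python
def call_genotype(wild_type_signal, sickle_signal, mismatch_baseline,
                  fold_cutoff=2.0):
    """Call a beta-globin genotype from one pair of completed PCRs.

    wild_type_signal, sickle_signal: fluorescence (arbitrary units) of the
        tubes containing wild-type-specific and sickle-specific primers.
    mismatch_baseline: fluorescence of a reaction whose primers match
        neither allele (a hypothetical reference value).
    fold_cutoff: assumed threshold; a tube counts as positive when its
        signal is at least this multiple of the baseline.
    """
    if mismatch_baseline <= 0:
        raise ValueError("mismatch_baseline must be positive")
    wild_positive = wild_type_signal / mismatch_baseline >= fold_cutoff
    sickle_positive = sickle_signal / mismatch_baseline >= fold_cutoff
    if wild_positive and sickle_positive:
        return "AS"       # both primer sets amplified: heterozygote
    if wild_positive:
        return "AA"       # only the wild-type primers amplified
    if sickle_positive:
        return "SS"       # only the sickle primers amplified
    return "no call"      # neither tube rose above the cutoff
```

In practice the cutoff would have to be set from known samples; a pair in
which neither tube is positive is reported as "no call" rather than
forced into a genotype.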
Continuous monitoring of a PCR. Using a fiber optic
device, it is possible to direct excitation illumination from
a spectrofluorometer to a PCR undergoing thermocycling
and to return its fluorescence to the spectrofluorometer.
The fluorescence readout of such an arrangement,
directed at an EtBr-containing amplification of
Y-chromosome specific sequences from 25 ng of human male DNA,
is shown in Figure 5. The readout from a control PCR
with no target DNA is also shown. Thirty cycles of PCR
were monitored for each.

The fluorescence trace as a function of time clearly
shows the effect of the thermocycling. Fluorescence
intensity rises and falls inversely with temperature. The
fluorescence intensity is minimum at the denaturation
temperature (94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change
significantly over the thirty thermocycles, indicating that there is
little dsDNA synthesis without the appropriate target
DNA, and there is little if any bleaching of EtBr during
the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the
fluorescence minima at the denaturation temperature do not
significantly increase, presumably because at this
temperature there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the
fluorescence increase at the annealing temperature. Analysis of
the products of these two amplifications by gel
electrophoresis showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.
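
Following an amplification by its annealing-temperature fluorescence
amounts to reducing the time trace to one maximum per thermocycle and
watching for the cycle at which those maxima start to climb. A minimal
sketch, assuming the trace is a list of evenly spaced readings and that
every cycle spans the same known number of readings; the function names,
the five-cycle baseline, and the 10% rise criterion are all hypothetical
choices for illustration, not values from the text.

```python
def per_cycle_maxima(readings, readings_per_cycle):
    """Return the largest fluorescence reading within each complete cycle."""
    if readings_per_cycle <= 0:
        raise ValueError("readings_per_cycle must be positive")
    n_cycles = len(readings) // readings_per_cycle
    return [
        max(readings[i * readings_per_cycle:(i + 1) * readings_per_cycle])
        for i in range(n_cycles)
    ]


def first_rising_cycle(maxima, baseline_cycles=5, rise_fraction=0.10):
    """Return the 1-based cycle whose maximum first exceeds the mean of the
    early-cycle maxima by rise_fraction, or None if no cycle does."""
    early = maxima[:baseline_cycles]
    if not early:
        return None
    baseline = sum(early) / len(early)
    for cycle, value in enumerate(maxima, start=1):
        if cycle > baseline_cycles and value > baseline * (1 + rise_fraction):
            return cycle
    return None
```

A flat series of maxima, as in the negative control, yields None; a
series that climbs after the baseline window yields the first cycle at
which the climb exceeds the chosen fraction.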

DISCUSSION

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the
appropriate target allele.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more total
DNA (up to microgram amounts) in order to have
sufficient numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The
primer-dimer product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

FIGURE 4 UV photography of PCR tubes containing amplifications
[...] the human β-globin gene. The left of each pair of tubes contains
allele-specific primers to the wild-type alleles, the right tube
[...] cycles of PCR, and the input DNAs and the alleles they contain
[...] was done in triplicate (3 pairs of PCRs) for each input DNA.

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber optic
was used to carry excitation light to a [...]
emitted light back to a fluorometer (see Experimental Protocol).
[...] starting with 20 ng of human male DNA (top), or a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a
number of approaches, including the use of nested-primer
amplifications that take place in a single tube 8, and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins. Preliminary results using these
approaches suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr
fluorescence in a PCR instigated by a single HIV genome in a
background of 10^[illegible] cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer.

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of
specific DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader.

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 5 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected.
Preliminary experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.
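
An idealized model suggests why two-fold differences in starting DNA
could be resolvable. Assuming perfect doubling in every cycle and a fixed
detection threshold expressed in copies (both simplifications: real
efficiencies are lower and amplification eventually plateaus), each
doubling of the starting copy number brings the threshold one cycle
earlier. The function name and the numbers in the usage note are
hypothetical.

```python
def cycles_to_threshold(initial_copies, threshold_copies, efficiency=1.0):
    """Smallest whole number of cycles n such that
    initial_copies * (1 + efficiency) ** n >= threshold_copies.

    efficiency=1.0 models ideal doubling; lower values model less
    efficient amplification. No plateau is modelled.
    """
    if initial_copies <= 0 or threshold_copies <= 0:
        raise ValueError("copy numbers must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    copies = float(initial_copies)
    cycles = 0
    while copies < threshold_copies:
        copies *= 1 + efficiency   # one round of amplification
        cycles += 1
    return cycles
```

With a threshold of 1e9 copies, starting amounts of 1000, 2000 and 4000
copies reach it after 20, 19 and 18 cycles respectively under this model,
so a two-fold difference appears as a one-cycle shift.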

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false
positive and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA
polymerase, may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of
cycles, many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic, an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.
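
The control scheme in the preceding paragraph can be summarized as a
classification of each monitored reaction by the cycle at which its
fluorescence first rises. The sketch below is one possible reading, not a
validated procedure: the function name, the category labels, and the
two-cycle tolerance are assumptions introduced for illustration.

```python
def interpret_run(detection_cycle, expected_cycle, marker_cycle, tolerance=2):
    """Classify one continuously monitored PCR of a sample whose number of
    target molecules is known.

    detection_cycle: cycle at which fluorescence first rose, or None if it
        never rose during the run.
    expected_cycle: cycle by which a true positive should be detectable.
    marker_cycle: cycle at which the inefficiently amplifying marker
        should become detectable (well after expected_cycle).
    tolerance: assumed allowable deviation, in cycles.
    """
    if marker_cycle <= expected_cycle + tolerance:
        raise ValueError("marker must appear well after the expected cycle")
    if detection_cycle is None:
        # Not even the marker amplified: the polymerase may be inhibited.
        return "inhibition suspected"
    if abs(detection_cycle - expected_cycle) <= tolerance:
        return "positive"
    if detection_cycle >= marker_cycle - tolerance:
        # Only the late marker signal appeared: target absent, reaction worked.
        return "negative, reaction valid"
    # A rise clearly before or after the expected cycle.
    return "possible artifact"
```

As the text notes, any such rule would need its false positive and false
negative rates assessed on a large number of known samples before use.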

In summary, the inclusion in PCR of dyes whose
fluorescence is enhanced upon binding dsDNA makes it
possible to detect specific DNA amplification from outside
the PCR tube. In the future, instruments based upon this
principle may facilitate the more widespread use of PCR
in applications that demand the high throughput of
samples.

EXPERIMENTAL PROTOCOL

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl,
pH 8.3; 50 mM KCl; 4 mM MgCl2; [illegible] units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27 19 and approximately 10^[illegible] copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 80 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min. denaturation and [illegible] for 30
sec. annealing and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction
volume) containing 0.5 µg/ml EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained 15 pmole each male DNA-specific primers
Y1.1 and Y1.2 20, and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min. and 60°C
for 1 min. using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence
measurement is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair BGP2/
Hβ14A (wild-type globin specific primers) or BGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al. 21. Three different
target DNAs were used in separate amplifications: 50 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the wild-type globin (AA). Thermocycling was for 30
cycles at 94°C for 1 min. and 55°C for 1 min. using a "step-cycle"
program. An annealing temperature of 55°C had been shown by
Wu et al. 21 to provide allele-specific amplification. Completed
PCRs were photographed through a red filter (Wratten 23A)
after placing the reaction tubes atop a model TM-96
transilluminator (UV-products, San Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a GG [illegible] nm cut-off filter (Melles
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 570 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached with "5 minute-epoxy" to the
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube
with its cap removed) effectively sealing it. The exposed top of
the PCR tube and the end of the fiberoptic cable were shielded
from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the dm3000f, version 2.5 (SPEX) data system.
SIMULTANEOUS AMPLIFICATION AND DETECTION OF
SPECIFIC DNA SEQUENCES

Higuchi*, Gavin Dollinger 1, P. Sean Walsh and Robert Griffith

94608. *Corresponding author.

We have enhanced the polymerase chain
reaction (PCR) such that specific DNA
sequences can be detected without opening
the reaction tube. This enhancement
requires the addition of ethidium bromide
(EtBr) to a PCR. Since the fluorescence of
EtBr increases in the presence of
double-stranded (ds) DNA, an increase in
fluorescence in such a PCR indicates a positive
amplification, which can be easily
monitored externally. In fact, amplification can
be continuously monitored in order to
follow its progress. The ability to
simultaneously amplify specific DNA sequences
and detect the product of the amplification
both simplifies and improves PCR and
may facilitate its automation and more
widespread use in the clinic or in other
situations requiring high sample
throughput.

Although the potential benefits of PCR 1 to clinical
diagnostics are well known, it is still not
widely used in this setting, even though it is
four years since thermostable DNA polymerase
made PCR practical. Some of the reasons for its slow
acceptance are high cost, lack of automation of pre- and
post-PCR processing steps, and false positive results from
carryover contamination. The first two points are related
in that labor is the largest contributor to cost at the present
stage of PCR development. Most current assays require
some form of "downstream" processing once thermocycling
is done in order to determine whether the target
DNA sequence was present and has amplified. These
include DNA hybridization, gel electrophoresis with or
without use of restriction digestion, HPLC 9, or capillary
electrophoresis. These methods are labor-intense, have
low throughput, and are difficult to automate. The third
point is also closely related to downstream processing.
The handling of the PCR product in these downstream
processes increases the chances that amplified DNA will
spread through the typing lab, resulting in a risk of
"carryover" false positives in subsequent testing.

These downstream processing steps would be
eliminated if specific amplification and detection of amplified
DNA took place simultaneously within an unopened
reaction vessel. Assays in which such different processes take
place without the need to separate reaction components
have been termed "homogeneous". No truly homogeneous
PCR assay has been demonstrated to date, although
progress towards this end has been reported. Chehab, et
al. 12, developed a PCR product detection scheme using
fluorescent primers that resulted in a fluorescent PCR
product. Allele-specific primers, each with different
fluorescent tags, were used to indicate the genotype of the
DNA. However, the unincorporated primers must still be
removed in a downstream process in order to visualize the
result. Recently, Holland, et al. 13, developed an assay in
which the endogenous 5' exonuclease activity of Taq DNA
polymerase was exploited to cleave a labeled oligonucleotide
probe. The probe would only cleave if PCR amplification
had produced its complementary sequence. In
order to detect the cleavage products, however, a
subsequent process is again needed.

We have developed a truly homogeneous assay for PCR
and PCR product detection based upon the greatly
increased fluorescence that ethidium bromide and other
DNA binding dyes exhibit when they are bound to
ds-DNA 14-16. As outlined in Figure 1, a prototypic PCR
begins with primers that are single-stranded DNA
(ss-DNA), dNTPs, and DNA polymerase. An amount of
dsDNA containing the target sequence (target DNA) is
also typically present. This amount can vary, depending
on the application, from single-cell amounts of DNA 17 to
micrograms per PCR 18. If EtBr is present, the reagents
that will fluoresce, in order of increasing fluorescence, are
free EtBr itself, and EtBr bound to the single-stranded
DNA primers and to the double-stranded target DNA (by
its intercalation between the stacked bases of the DNA
double-helix). After the first denaturation cycle, target
DNA will be largely single-stranded. After a PCR is
completed, the most significant change is the increase in
the amount of dsDNA (the PCR product itself) of up to
several micrograms. Formerly free EtBr is bound to the
additional dsDNA, resulting in an increase in fluorescence.
There is also some decrease in the amount of
ssDNA primer, but because the binding of EtBr to ssDNA
is much less than to dsDNA, the effect of this change on
the total fluorescence of the sample is small. The
fluorescence increase can be measured by directing excitation
illumination through the walls of the amplification vessel
before and after, or even continuously during,
thermocycling.

FIGURE 1 Principle of simultaneous amplification and detection of
PCR product. The components of a PCR containing EtBr that are
fluorescent are listed: EtBr itself, EtBr bound to either ssDNA or
dsDNA. There is a large fluorescence enhancement when EtBr is
bound to DNA and binding is greatly enhanced when DNA is
double-stranded. After sufficient (n) cycles of PCR, the net
increase in dsDNA results in additional EtBr binding, and a net
increase in total fluorescence.

FIGURE 2 Gel electrophoresis of PCR amplification products of the
human, nuclear gene, HLA DQα, made in the presence of
increasing amounts of EtBr (up to 8 µg/ml). The presence of
EtBr has no obvious effect on the yield or specificity of
amplification.

FIGURE 3 (A) Fluorescence measurements from PCRs that contain
0.5 µg/ml EtBr and that are specific for Y-chromosome repeat
sequences. Five replicate PCRs were begun containing each of the
DNAs specified. At each indicated cycle, one of the five replicate
PCRs for each DNA was removed from thermocycling and its
fluorescence measured. Units of fluorescence are arbitrary. (B)
UV photography of PCR tubes (0.5 ml Eppendorf-style,
polypropylene micro-centrifuge tubes) containing reactions, those
starting from 2 ng male DNA and control reactions without any DNA,
from (A).

RESULTS 

PCR in the presence of EtBr. In order to assess the
effect of EtBr in PCR, amplifications of the human HLA
DQα gene 19 were performed with the dye present at
concentrations from 0.06 to 8.0 µg/ml (a typical
concentration of EtBr used in staining of nucleic acids following
gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel
electrophoresis revealed little or no difference in the yield
or quality of the amplification product whether EtBr was
absent or present at any of these concentrations,
indicating that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific sequences.
Sequence-specific, fluorescence enhancement of
EtBr as a result of PCR was demonstrated in a series of
amplifications containing 0.5 µg/ml EtBr and primers
specific to repeat DNA sequences found on the human
Y-chromosome 20. These PCRs initially contained either
60 ng male, 60 ng female, 2 ng male human or no DNA.
Five replicate PCRs were begun for each DNA. After 0,
17, 21, 24 and 29 cycles of thermocycling, a PCR for each
DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted
vs. amplification cycle number (Fig. 3A). The shape of this
curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is
becoming linear and not exponential with cycle number.
As shown, the fluorescence increased about three-fold
over the background fluorescence for the PCRs containing
human male DNA, but did not significantly increase
for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA
present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in
fluorescence. Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the
expected size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.

In addition, the increase in fluorescence was visualized
by simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red
filter. This is shown in Figure 3B for the reactions that
began with 2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin
gene. In order to demonstrate that this approach has
adequate specificity to allow genetic screening, a detection
of the sickle-cell anemia mutation was performed. Figure
4 shows the fluorescence from completed amplifications
containing EtBr (0.5 µg/ml) as detected by photography
of the reaction tubes on a UV transilluminator. These
reactions were performed using primers specific for
either the wild-type or sickle-cell mutation of the human
β-globin gene 21. The specificity for each allele is imparted
by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer
annealing temperature, primer extension, and thus
amplification, can take place only if the 3' nucleotide of the
primer is complementary to the β-globin allele present.

Each pair of amplifications shown in Figure 4 consists of
a reaction with either the wild-type allele specific (left
tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous,
wild-type β-globin individual (AA); from a heterozygous
sickle β-globin individual (AS); and from a homozygous
sickle β-globin individual (SS). Each DNA (50 ng genomic
DNA to start each PCR) was analyzed in triplicate (3 pairs




PAGE 3/6 * RCVD AT 7/19/2004 3:10:03 PM pacific Daylight Time) " SVR:SVCS01/0 * DNIS:6638 * CSID:650 952 9881 * DURATION (mm-ss):04-46 








JUL-19-E004 13:51 FROM : GENENTF^KGRL 650 95S 9881 



of reactions each). The DNA type was reflected in the
relative fluorescence intensities in each pair of completed
amplifications. There was a significant increase in
fluorescence only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer
(data not shown), this fluorescence was about three times
that present in a PCR where both β-globin alleles were
mismatched to the primer set. Gel electrophoresis (not
shown) established that this increase in fluorescence was
due to the synthesis of nearly a microgram of a DNA
fragment of the expected size for β-globin. There was
little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.
'continuous moritaribog of a PGR, Using a fiber optic 
device H i* possible to direct excitation illumination from 
, q^ctrofluorometer to a PCR undergoing thcrrnocycling 
and to return its fluorescence to the spectroBuororoeter. 
The fluorescence readout of such an arrangement, di- 
rected at an EtBr-containing amplification of Y-chromo- 
somc spcd6c sequences from 25 ng of human male DN/^ 
h shown in Figure 5. The readout from a contrd PCR 
yfth no target DNA is also shown. Thirty cycles of PGR 
were monitored for each- , 

The fluorescence trace as a function of time clearly
shows the effect of the thermocycling. Fluorescence intensity
rises and falls inversely with temperature. The fluorescence
intensity is minimum at the denaturation temperature
(94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change significantly
over the thirty thermocycles, indicating that there is
little dsDNA synthesis without the appropriate target
DNA, and there is little if any bleaching of EtBr during
the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the fluorescence
minima at the denaturation temperature do not
significantly increase, presumably because at this temperature
there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the fluorescence
increase at the annealing temperature. Analysis of
the products of these two amplifications by gel electrophoresis
showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.
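
The tracking described above can be illustrated with a minimal sketch. It assumes that one fluorescence maximum per cycle has already been extracted from the trace at the annealing/extension temperature. The function name, the baseline length, the threshold rule (baseline mean plus three standard deviations) and the numbers are all hypothetical illustrations, not the procedure or data of the paper.

```python
from statistics import mean, stdev


def first_rising_cycle(annealing_maxima, baseline_cycles=10, k=3.0):
    """Return the 1-based cycle at which the annealing-phase fluorescence
    maximum first exceeds (baseline mean + k * baseline SD), or None.

    The baseline is taken from the first `baseline_cycles` cycles; this
    thresholding rule is an assumption made for illustration only.
    """
    baseline = annealing_maxima[:baseline_cycles]
    threshold = mean(baseline) + k * stdev(baseline)
    for cycle, value in enumerate(annealing_maxima, start=1):
        if cycle > baseline_cycles and value > threshold:
            return cycle
    return None


# Hypothetical per-cycle maxima in arbitrary units (not data from the paper).
control = [1.00, 1.01, 0.99, 1.00, 1.02, 0.98, 1.00, 1.01, 0.99, 1.00,
           1.01, 1.00, 0.99, 1.01, 1.00]
male_dna = control[:10] + [1.01, 1.03, 1.10, 1.25, 1.50]

print(first_rising_cycle(control))
print(first_rising_cycle(male_dna))
```

With these made-up traces the control never crosses the threshold, while the template-containing reaction does so a few cycles after the baseline window.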

DISCUSSION

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
nonspecific production of dsDNA in the absence of the
appropriate target.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more
FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber optic
was used to carry excitation light to a PCR and
emitted light back to a fluorometer (see Experimental Protocol).
Amplifications using human male-DNA specific primers, in a PCR
starting with 20 ng of human male DNA (top) or in a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.
total DNA, up to microgram amounts, in order to have
sufficient numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The
primer-dimer product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a number
of approaches, including the use of nested-primer
amplifications that take place in a single tube 8 , and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins 8 *. Preliminary results using these
approaches suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr fluorescence
in a PCR instigated by a single HIV genome in a
background of 10* cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer 8 ' 1 .

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of specific
DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format**. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader 46 .

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 3 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected. Preliminary
experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.
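
As a rough illustration of why the timing of the fluorescence increase carries copy-number information, the sketch below assumes an idealized exponential model, N = N0 (1 + E)^c, that the paper does not state. The detection level of 1e11 copies and the starting copy numbers are hypothetical. Under perfect doubling (E = 1), a two-fold difference in starting target shifts the detection point by exactly one cycle.

```python
import math


def cycles_to_threshold(initial_copies, threshold_copies, efficiency=1.0):
    """Cycles needed for N0 * (1 + E)**c to reach a detection threshold.

    Idealized model assumed for illustration; real reactions plateau and
    rarely amplify at constant efficiency.
    """
    return math.log(threshold_copies / initial_copies) / math.log(1.0 + efficiency)


# Hypothetical starting copy numbers and detection level.
c_low = cycles_to_threshold(1_000, 1e11)
c_high = cycles_to_threshold(2_000, 1e11)

print(round(c_low, 2), round(c_high, 2), round(c_low - c_high, 6))
```

Resolving a one-cycle shift is therefore what a sensitivity to two-fold differences would amount to under this simplified model.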

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false positive
and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA polymerase,
may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of cycles,
many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic, an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.
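
The control logic outlined above could be sketched as a simple decision rule. Everything in this example is a hypothetical illustration: the function name, the tolerance of two cycles, and the cycle numbers are assumptions, and the rule presumes each reaction is run long enough for the inefficiently amplifying marker to appear.

```python
def classify_reaction(rise_cycle, expected_cycle, marker_cycle, tolerance=2):
    """Classify a monitored PCR from the cycle at which fluorescence rises.

    rise_cycle:     cycle of first detected increase, or None if none seen
    expected_cycle: cycle predicted for a true positive (known target number)
    marker_cycle:   cycle at which the inefficient marker alone becomes visible
    """
    if rise_cycle is None:
        # Not even the marker amplified.
        return "inhibition suspected"
    if abs(rise_cycle - expected_cycle) <= tolerance:
        return "consistent with true positive"
    if rise_cycle >= marker_cycle - tolerance:
        # Only the late marker signal appeared.
        return "negative (marker only)"
    return "potential artifact"


# Hypothetical cycle numbers.
for rise in (None, 24, 39, 15, 32):
    print(rise, "->", classify_reaction(rise, expected_cycle=25, marker_cycle=40))
```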

In summary, the inclusion in PCR of dyes whose fluorescence
is enhanced upon binding dsDNA makes it
possible to detect specific DNA amplification from outside
the PCR tube. In the future, instruments based upon this
principle may facilitate the more widespread use of PCR
in applications that demand the high throughput of
samples.

EXPERIMENTAL PROTOCOL

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl,
pH 8.3; 50 mM KCl; 4 mM MgCl2; 2.5 units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27 19 and approximately W copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 20 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min. denaturation and Wv. for 30
sec. annealing and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction
volume) containing t) J EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained 15 pmole each male DNA-specific primers
Y1.1 and Y1.2 M , and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min. and 60°C
for 1 min. using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence
measurement is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair HOPS'
HBMA (wild-type globin specific primers) or HGP2/H(ll-tS
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al. 21 . Three different
target DNAs were used in separate amplifications: 60 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the wild-type globin (AA). Thermocycling was for 30
cycles at 94°C for 1 min. and 55°C for 1 min. using a "step-cycle"
program. An annealing temperature of 55°C had been shown by
Wu et al. 21 to provide allele-specific amplification. Completed
PCRs were photographed through a red filter (Wratten 23A)
after placing the reaction tubes atop a model TM-36
transilluminator (UV-products, San Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a GG 455 nm cut-off filter (Melles
Griot Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 570 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached with "5 minute-epoxy" to the
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube
with its cap removed) effectively sealing it. The exposed top of
the PCR tube and the end of the fiberoptic cable were shielded
from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the droSWOf, version S.5 (SPEX) data system.

Acknowledgments
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Oligonucleotides with Fluorescent Dyes at 
Opposite Ends Provide a Quenched Probe 
System Useful for Detecting PCR Product 
and Nucleic Acid Hybridization 

Kenneth J. Livak, Susan J.A. Flood, Jeffrey Marmaro, William Giusti, and Karin Deetz 

Perkin-Elmer, Applied Biosystems Division, Foster City, California 94404 



The 5' nuclease PCR assay detects the
accumulation of specific PCR product
by hybridization and cleavage of a
double-labeled fluorogenic probe
during the amplification reaction.
The probe is an oligonucleotide with
both a reporter fluorescent dye and a
quencher dye attached. An increase
in reporter fluorescence intensity
indicates that the probe has hybridized
to the target PCR product and has
been cleaved by the 5'→3' nucleolytic
activity of Taq DNA polymerase.
In this study, probes with the
quencher dye attached to an internal
nucleotide were compared with
probes with the quencher dye attached
to the 3'-end nucleotide. In all
cases, the reporter dye was attached
to the 5' end. All intact probes
showed quenching of the reporter
fluorescence. In general, probes with
the quencher dye attached to the
3'-end nucleotide exhibited a larger
signal in the 5' nuclease PCR assay than
the internally labeled probes. It is
proposed that the larger signal is
caused by increased likelihood of
cleavage by Taq DNA polymerase
when the probe is hybridized to a
template strand during PCR. Probes
with the quencher dye attached to
the 3'-end nucleotide also exhibited
an increase in reporter fluorescence
intensity when hybridized to a
complementary strand. Thus,
oligonucleotides with reporter and quencher
dyes attached at opposite ends can
be used as homogeneous hybridization
probes.



A homogeneous assay for detecting
the accumulation of specific PCR product
that uses a double-labeled fluorogenic
probe was described by Lee et al.(1)
The assay exploits the 5'→3' nucleolytic
activity of Taq DNA polymerase
and is diagramed in Figure 1.
The fluorogenic probe consists of an
oligonucleotide with a reporter fluorescent
dye, such as a fluorescein, attached to
the 5' end; and a quencher dye, such as a
rhodamine, attached internally. When
the fluorescein is excited by irradiation,
its fluorescent emission will be
quenched if the rhodamine is close
enough to be excited through the process
of fluorescence energy transfer
(FET).(4,5) During PCR, if the probe is
hybridized to a template strand, Taq DNA
polymerase will cleave the probe because
of its inherent 5'→3' nucleolytic
activity. If the cleavage occurs between
the fluorescein and rhodamine dyes, it
causes an increase in fluorescein fluorescence
intensity because the fluorescein
is no longer quenched. The increase in
fluorescein fluorescence intensity indicates
that the probe-specific PCR product
has been generated. Thus, FET between a
reporter dye and a quencher dye is critical
to the performance of the probe in
the 5' nuclease PCR assay.

Quenching is completely dependent 
on the physical proximity of the two 
dyes.(6) Because of this, it has been
assumed that the quencher dye must be
attached near the 5' end. Surprisingly, 
we have found that attaching a rho- 
damine dye at the 3' end of a probe 
still provides adequate quenching for 
the probe to perform in the 5' nuclease 



PCR assay. Furthermore, cleavage of this 
type of probe is not required to achieve 
some reduction in quenching. Oligonu- 
cleotides with a reporter dye on the 5' 
end and a quencher dye on the 3' end 
exhibit a much higher reporter fluores- 
cence when double-stranded as com- 
pared with single-stranded. This should 
make it possible to use this type of dou- 
ble-labeled probe for homogeneous de- 
tection of nucleic acid hybridization. 



MATERIALS AND METHODS 
Oligonucleotides 

Table 1 shows the nucleotide sequence 
of the oligonucleotides used in this 
study. Linker arm nucleotide (LAN) 
phosphoramidite was obtained from 
Glen Research. The standard DNA phos- 
phoramidites, 6-carboxyfluorescein (6- 
FAM) phosphoramidite, 6-carboxytet- 
ramethylrhodamine succinimidyl ester 
(TAMRA NHS ester), and Phosphalink 
for attaching a 3 '-blocking phosphate, 
were obtained from Perkin-Elmer, Ap- 
plied Biosystems Division. Oligonucle- 
otide synthesis was performed using an 
ABI model 394 DNA synthesizer (Applied 
Biosystems). Primer and complement 
oligonucleotides were purified using 
Oligo Purification Cartridges (Applied 
Biosystems). Double-labeled probes were 
synthesized with 6-FAM-labeled phos- 
phoramidite at the 5' end, LAN replacing 
one of the T's in the sequence, and Phos- 
phalink at the 3' end. Following de- 
protection and ethanol precipitation, 
TAMRA NHS ester was coupled to the 
LAN-containing oligonucleotide in 250
[Figure 1: diagram of the 5' nuclease PCR assay. Panel titles: Polymerization; Strand displacement; Cleavage; Polymerization completed. Component labels: Forward Primer, Probe, Reverse Primer; strand ends marked 5' and 3'.]



mM Na-bicarbonate buffer (pH 9.0) at
room temperature. Unreacted dye was
removed by passage over a PD-10 Sephadex
column. Finally, the double-labeled
probe was purified by preparative
high-performance liquid chromatography
(HPLC) using an Aquapore C8 220×4.6-mm
column with 7-µm particle size. The
column was developed with a 24-min
linear gradient of 8-20% acetonitrile in
0.1 M TEAA (triethylamine acetate).
Probes are named by designating the
sequence from Table 1 and the position of
the LAN-TAMRA moiety. For example,
probe A1-7 has sequence A1 with
LAN-TAMRA at nucleotide position 7 from the
5' end.
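
The naming convention can be made concrete with a short sketch. It uses the R/Q/p notation that appears in the probe listing accompanying Figure 2 (R for the 5' reporter, Q for the LAN-TAMRA position, p for the 3' phosphate) and the A1 sequence from Table 1. The function name is hypothetical; the check that the replaced base is a T reflects the statement above that LAN replaces one of the T's.

```python
def label_probe(sequence, tamra_position):
    """Render a double-labeled probe in R...Q...p notation.

    tamra_position is counted from the 5' end, starting at 1. LAN-TAMRA
    (shown as Q) may only replace a T in the sequence.
    """
    index = tamra_position - 1
    if sequence[index] != "T":
        raise ValueError(f"position {tamra_position} is {sequence[index]}, not T")
    return "R" + sequence[:index] + "Q" + sequence[index + 1:] + "p"


A1 = "ATGCCCTCCCCCATGCCATCCTGCGT"  # sequence A1 from Table 1, without the 3' p

for position in (2, 7, 14, 19, 22, 26):
    print(f"A1-{position}", label_probe(A1, position))
```

Each of the six positions used for the A1 series falls on a T, so none of the calls raises an error.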



PCR Systems 

All PCR amplifications were performed
in the Perkin-Elmer GeneAmp PCR System
9600 using 50-µl reactions that contained
10 mM Tris-HCl (pH 8.3), 50 mM
KCl, 200 µM dATP, 200 µM dCTP, 200 µM
dGTP, 400 µM dUTP, 0.5 unit of AmpErase
uracil N-glycosylase (Perkin-Elmer),
and 1.25 unit of AmpliTaq DNA polymerase
(Perkin-Elmer). A 295-bp segment
from exon 3 of the human β-actin
gene (nucleotides 2141-2435 in the
sequence of Nakajima-Iijima et al.)(7) was
amplified using primers AFP and ARP
(Table 1), which are modified slightly
from those of du Breuil et al.(8) Actin
amplification reactions contained 4 mM
MgCl2, 20 ng of human genomic DNA,
50 nM A1 or A3 probe, and 300 nM each
primer. The thermal regimen was 50°C
(2 min), 95°C (10 min), 40 cycles of 95°C
(20 sec), 60°C (1 min), and hold at 72°C.
A 515-bp segment was amplified from a
plasmid that consists of a segment of λ
DNA (nucleotides 32,220-32,747) inserted
in the SmaI site of vector pUC119.
These reactions contained 3.5 mM
MgCl2, 1 ng of plasmid DNA, 50 nM P2 or
P5 probe, 200 nM primer F119, and 200
nM primer R119. The thermal regimen
was 50°C (2 min), 95°C (10 min), 25 cycles
of 95°C (20 sec), 57°C (1 min), and
hold at 72°C.



Fluorescence Detection 

For each amplification reaction, a 40-µl
aliquot of a sample was transferred to an
individual well of a white, 96-well microtiter
plate (Perkin-Elmer). Fluorescence
was measured on the Perkin-Elmer TaqMan
LS-50B System, which consists of a
luminescence spectrometer with plate
reader assembly, a 485-nm excitation filter,
and a 515-nm emission filter. Excitation
was at 488 nm using a 5-nm slit
width. Emission was measured at 518
nm for 6-FAM (the reporter or R value)
and 582 nm for TAMRA (the quencher or
Q value) using a 10-nm slit width. To
determine the increase in reporter emission
that is caused by cleavage of the
probe during PCR, three normalizations
are applied to the raw emission data.
First, emission intensity of a buffer blank
is subtracted for each wavelength.
Second, emission intensity of the reporter is



TABLE 1

Name   Type         Sequence
F119   primer       ACCCACAGGAACTGATCACCACTC
R119   primer       ATGTCGCGTTCCGGCTGACGTTCTGC
P2     probe        TCGCATTACTGATCGTTGCCAACCAGTp
P2C    complement   GTACTGGTTGGCAACGATCAGTAATGCGATG
P5     probe        CGGATTTGCTGGTATCTATGACAAGGATp
P5C    complement   TTCATCCTTGTCATAGATACCAGCAAATCCG
AFP    primer       TCACCCACACTGTGCCCATCTACGA
ARP    primer       CAGCGGAACCGCTCATTGCCAATGG
A1     probe        ATGCCCTCCCCCATGCCATCCTGCGTp
A1C    complement   AGACGCAGGATGGCATGGGGGAGGGCATAC
A3     probe        CGCCCTGGACTTCGAGCAAGAGATp
A3C    complement   CCATCTCTTGCTCGAAGTCCAGGGCGAC

For each oligonucleotide used in this study, the nucleic acid sequence ...
... substituted for a T. (p) The presence of a 3' phosphate on each probe.
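
As a consistency check on the sequences listed in Table 1 (read here with the obvious scanning errors corrected, which is an assumption of this sketch), each probe should appear within the reverse complement of its designated complement oligonucleotide. The helper name is hypothetical, and the 3' phosphate marker p is omitted from the probe strings.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(COMPLEMENT)[::-1]


# (probe, complement) pairs from Table 1, written 5' to 3'.
PAIRS = {
    "P2": ("TCGCATTACTGATCGTTGCCAACCAGT", "GTACTGGTTGGCAACGATCAGTAATGCGATG"),
    "P5": ("CGGATTTGCTGGTATCTATGACAAGGAT", "TTCATCCTTGTCATAGATACCAGCAAATCCG"),
    "A1": ("ATGCCCTCCCCCATGCCATCCTGCGT", "AGACGCAGGATGGCATGGGGGAGGGCATAC"),
    "A3": ("CGCCCTGGACTTCGAGCAAGAGAT", "CCATCTCTTGCTCGAAGTCCAGGGCGAC"),
}

results = {name: probe in reverse_complement(comp)
           for name, (probe, comp) in PAIRS.items()}
for name, matches in results.items():
    print(name, len(PAIRS[name][0]), "nt", "pairs with complement:", matches)
```

The probe lengths printed (27, 28, 26 and 24 nt) also agree with the 3'-terminal probe names P2-27, P5-28, A1-26 and A3-24 used later in the text.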
Probe   Sequence
A1-2    RAQGCCCTCCCCCATGCCATCCTGCGTp
A1-7    RATGCCCQCCCCCATGCCATCCTGCGTp
A1-14   RATGCCCTCCCCCAQGCCATCCTGCGTp
A1-19   RATGCCCTCCCCCATGCCAQCCTGCGTp
A1-22   RATGCCCTCCCCCATGCCATCCQGCGTp
A1-26   RATGCCCTCCCCCATGCCATCCTGCGQp

        518 nm                        582 nm
Probe   no temp.       + temp.        no temp.      + temp.       RQ−           RQ+           ΔRQ
A1-2     25.5 ± 2.1     32.7 ± 1.9     38.2 ± 3.0    38.2 ± 2.0   0.67 ± 0.01   0.86 ± 0.06   0.19 ± 0.06
A1-7     53.5 ± 4.3    395.1 ± 21.4   108.5 ± 6.3   110.3 ± 5.3   0.49 ± 0.03   3.58 ± 0.17   3.09 ± 0.18
A1-14   127.0 ± 4.9    403.5 ± 19.1   109.7 ± 5.3    93.1 ± 6.3   1.16 ± 0.02   4.34 ± 0.15   3.18 ± 0.15
A1-19   187.5 ± 17.9   422.7 ± 7.7     70.3 ± 7.4    73.0 ± 2.8   2.67 ± 0.05   5.80 ± 0.15   3.13 ± 0.16
A1-22   224.6 ± 9.4    482.2 ± 43.6   100.0 ± 4.0    96.2 ± 9.6   2.25 ± 0.03   5.02 ± 0.11   2.77 ± 0.12
A1-26   160.2 ± 8.9    454.1 ± 18.4    93.1 ± 5.4    90.7 ± 3.2   1.72 ± 0.02   5.01 ± 0.08   3.29 ± 0.08

FIGURE 2 Results of 5' nuclease assay comparing β-actin probes with TAMRA at different
nucleotide positions. As described in Materials and Methods, PCR amplifications containing the
indicated probes were performed, and the fluorescence emission was measured at 518 and 582 nm.
Reported values are the average ± 1 S.D. for six reactions run without added template (no temp.)
and six reactions run with template (+ temp.). The RQ ratio was calculated for each individual
reaction and averaged to give the reported RQ− and RQ+ values.



divided by the emission intensity of the
quencher to give an RQ ratio for each
reaction tube. This normalizes for
well-to-well variations in probe concentration
and fluorescence measurement.
Finally, ΔRQ is calculated by subtracting
the RQ value of the no-template control
(RQ−) from the RQ value for the complete
reaction including template
(RQ+).
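
The three normalizations can be written out as a short calculation. This is a sketch only: the function names are hypothetical, the blank values default to zero on the assumption that the tabulated emissions are already blank-subtracted, and the inputs are the mean 518-nm and 582-nm emissions reported for probe A1-7 in Figure 2. The paper computes RQ for each individual reaction before averaging, so using the means here is an approximation.

```python
def rq_ratio(reporter, quencher, reporter_blank=0.0, quencher_blank=0.0):
    """Blank-subtracted reporter emission divided by blank-subtracted
    quencher emission for one reaction."""
    return (reporter - reporter_blank) / (quencher - quencher_blank)


def delta_rq(rq_plus, rq_minus):
    """Increase in RQ attributable to amplification: RQ+ minus RQ-."""
    return rq_plus - rq_minus


# Mean emissions for probe A1-7 (518 nm reporter, 582 nm quencher).
rq_minus = rq_ratio(53.5, 108.5)    # no-template control
rq_plus = rq_ratio(395.1, 110.3)    # complete reaction with template

print(round(rq_minus, 2), round(rq_plus, 2), round(delta_rq(rq_plus, rq_minus), 2))
```

The rounded results agree with the RQ−, RQ+ and ΔRQ entries reported for A1-7.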

RESULTS 

A series of probes with increasing distances
between the fluorescein reporter
and rhodamine quencher were tested to
investigate the minimum and maximum
spacing that would give an acceptable
performance in the 5' nuclease PCR assay.
These probes hybridize to a target
sequence in the human β-actin gene.
Figure 2 shows the results of an experiment
in which these probes were included
in PCR that amplified a segment
of the β-actin gene containing the target
sequence. Performance in the 5' nuclease
PCR assay is monitored by the
magnitude of ΔRQ, which is a measure
of the increase in reporter fluorescence
caused by PCR amplification of the
probe target. Probe A1-2 has a ΔRQ value
that is close to zero, indicating that the
probe was not cleaved appreciably during
the amplification reaction. This suggests
that with the quencher dye on the
second nucleotide from the 5' end, there
is insufficient room for Taq polymerase
to cleave efficiently between the reporter
and quencher. The other five probes
exhibited comparable ΔRQ values that are
clearly different from zero. Thus, all five
probes are being cleaved during PCR
amplification resulting in a similar increase
in reporter fluorescence. It should be
noted that complete digestion of a probe
produces a much larger increase in reporter
fluorescence than that observed
in Figure 2 (data not shown). Thus, even
in reactions where amplification occurs,
the majority of probe molecules remain
uncleaved. It is mainly for this reason
that the fluorescence intensity of the
quencher dye TAMRA changes little with
amplification of the target. This is what
allows us to use the 582-nm fluorescence
reading as a normalization factor.

The magnitude of RQ− depends
mainly on the quenching efficiency inherent
in the specific structure of the
probe and the purity of the oligonucleotide.
Thus, the larger RQ− values indicate
that probes A1-14, A1-19, A1-22, and
A1-26 probably have reduced quenching
as compared with A1-7. Still, the degree
of quenching is sufficient to detect a
highly significant increase in reporter
fluorescence when each of these probes
is cleaved during PCR.

To further investigate the ability of
TAMRA on the 3' end to quench 6-FAM
on the 5' end, three additional pairs of
probes were tested in the 5' nuclease
PCR assay. For each pair, one probe has
TAMRA attached to an internal nucleotide
and the other has TAMRA attached
to the 3' end nucleotide. The results are
shown in Table 2. For all three sets, the
probe with the 3' quencher exhibits a
ΔRQ value that is considerably higher
than for the probe with the internal
quencher. The RQ− values suggest that
differences in quenching are not as great
as those observed with some of the A1
probes. These results demonstrate that a
quencher dye on the 3' end of an oligonucleotide
can quench efficiently the


TABLE 2 Results of 5' Nuclease Assay Comparing Probes with TAMRA Attached to an Internal or 3'-terminal Nucleotide

        518 nm                        582 nm
Probe   no temp.       + temp.        no temp.      + temp.        RQ−           RQ+           ΔRQ
A3-6     54.6 ± 3.2     84.8 ± 3.7    116.2 ± 6.4   115.6 ± 2.5    0.47 ± 0.02   0.73 ± 0.03   0.26 ± 0.04
A3-24    72.1 ± 2.9    236.5 ± 11.1    84.2 ± 4.0    90.2 ± 3.8    0.86 ± 0.02   2.62 ± 0.05   1.76 ± 0.05
P2-7     82.8 ± 4.4    384.0 ± 34.1   105.1 ± 6.4   120.4 ± 10.2   0.79 ± 0.02   3.19 ± 0.16   2.40 ± 0.16
P2-27   113.4 ± 6.6    555.4 ± 14.1   140.7 ± 8.5   118.7 ± 4.8    0.81 ± 0.01   4.68 ± 0.10   3.88 ± 0.10
P5-10    77.5 ± 6.5    244.4 ± 15.9    86.7 ± 4.3    95.8 ± 6.7    0.89 ± 0.05   2.55 ± 0.06   1.66 ± 0.08
P5-28    64.0 ± 5.2    333.6 ± 12.1   100.6 ± 6.1    94.7 ± 6.3    0.63 ± 0.02   3.53 ± 0.12   2.89 ± 0.13

Reactions containing the indicated probes and calculations were performed as described in Materials and Methods and in the legend to Fig. 2.
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fluorescence of a reporter dye on the 5' 
end. The degree of quenching is suffi- 
cient for this type of oligonucleotide to 
be used as a probe in the 5' nuclease PCR 
assay. 

To test the hypothesis that quenching 
by a 3' TAMRA depends on the flexibility 
of the oligonucleotide, fluorescence was 
measured for probes in the single- 
stranded and double-stranded states. Ta- 
ble 3 reports the fluorescence observed 
at 518 and 582 nm. The relative degree 
of quenching is assessed by calculating 
the RQ ratio. For probes with TAMRA 
6-10 nucleotides from the 5' end, there 
is little difference in the RQ values when 
comparing single-stranded with double- 
stranded oligonucleotides. The results 
for probes with TAMRA at the 3' end are 
much different. For these probes, hy- 
bridization to a complementary strand 
causes a dramatic increase in RQ. We 
propose that this loss of quenching is 
caused by the rigid structure of double- 
stranded DNA, which prevents the 5' 
and 3' ends from being in proximity. 
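
The contrast between internally labeled and 3'-labeled probes can be summarized as the fold change in RQ on hybridization. The sketch below uses the single-stranded (ss) and double-stranded (ds) RQ values as printed in Table 3; the dictionary layout and the idea of reporting a ds/ss ratio are illustrative choices, not an analysis performed in the paper.

```python
# RQ values from Table 3: probe -> (ss, ds)
TABLE3_RQ = {
    "A1-7": (0.45, 0.50), "A1-26": (0.81, 5.43),
    "A3-6": (0.43, 0.38), "A3-24": (0.45, 3.21),
    "P2-7": (0.64, 0.58), "P2-27": (0.61, 5.25),
    "P5-10": (0.44, 0.87), "P5-28": (0.46, 4.43),
}

INTERNAL = ("A1-7", "A3-6", "P2-7", "P5-10")      # TAMRA 6-10 nt from the 5' end
THREE_PRIME = ("A1-26", "A3-24", "P2-27", "P5-28")  # TAMRA on the 3' end nucleotide

fold_change = {probe: ds / ss for probe, (ss, ds) in TABLE3_RQ.items()}
for probe in INTERNAL + THREE_PRIME:
    print(f"{probe:6s} ds/ss RQ = {fold_change[probe]:.1f}")
```

On these values the internally labeled probes change by at most about two-fold, while every 3'-labeled probe shows a several-fold increase in RQ when double-stranded.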

When TAMRA is placed toward the 3'
end, there is a marked Mg2+ effect on
quenching. Figure 3 shows a plot of observed
RQ values for the A1 series of
probes as a function of Mg2+ concentration.
With TAMRA attached near the 5'
end (probe A1-2 or A1-7), the RQ value at
0 mM Mg2+ is only slightly higher than
RQ at 10 mM Mg2+. For probes A1-19,
A1-22, and A1-26, the RQ values at 0 mM
Mg2+ are very high, indicating a much
reduced quenching efficiency. For each
of these probes, there is a marked decrease
in RQ at 1 mM Mg2+ followed by
a gradual decline as the Mg2+ concentration
increases to 10 mM. Probe A1-14
shows an intermediate RQ value at 0 mM
Mg2+ with a gradual decline at higher
Mg2+ concentrations. In a low-salt environment
with no Mg2+ present, a single-stranded
oligonucleotide would be
expected to adopt an extended conformation
because of electrostatic repulsion.
The binding of Mg2+ ions acts to
shield the negative charge of the phosphate
backbone so that the oligonucleotide
can adopt conformations where
the 3' end is close to the 5' end. Therefore,
the observed Mg2+ effects support
the notion that quenching of a 5' reporter
dye by TAMRA at or near the 3'
end depends on the flexibility of the
oligonucleotide.

DISCUSSION 

The striking finding of this study is that
it seems the rhodamine dye TAMRA,
placed at any position in an oligonucleotide,
can quench the fluorescent emission
of a fluorescein (6-FAM) placed at
the 5' end. This implies that a single-stranded,
double-labeled oligonucleotide
must be able to adopt conformations
where the TAMRA is close to the 5'
end. It should be noted that the decay of
6-FAM in the excited state requires a certain
amount of time. Therefore, what



TABLE 3 Comparison of Fluorescence Emissions of Single-stranded and
Double-stranded Fluorogenic Probes

        518 nm             582 nm             RQ
Probe   ss       ds        ss       ds        ss     ds
A1-7    27.75     68.53    61.08    138.18    0.45   0.50
A1-26   43.31    509.38    53.50     93.86    0.81   5.43
A3-6    16.75     62.88    39.33    165.57    0.43   0.38
A3-24   30.05    578.64    67.72    140.25    0.45   3.21
P2-7    35.02     70.13    54.63    121.09    0.64   0.58
P2-27   39.89    320.47    65.10     61.13    0.61   5.25
P5-10   27.34    144.85    61.95    165.54    0.44   0.87
P5-28   33.65    462.29    72.39    104.61    0.46   4.43

(ss) Single-stranded. The fluorescence emissions at 518 or 582 nm for solutions containing a final
concentration of 50 nM indicated probe, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 10 mM MgCl2.
(ds) Double-stranded. The solutions contained, in addition, 100 nM A1C for probes A1-7 and
A1-26, 100 nM A3C for probes A3-6 and A3-24, 100 nM P2C for probes P2-7 and P2-27, or 100 nM
P5C for probes P5-10 and P5-28. Before the addition of MgCl2, 120 µl of each sample was heated
at 95°C forTmuf. Following the addition of 80 µl of 25 mM MgCl2, each sample was allowed to
cool to room temperature and the fluorescence emissions were measured. Reported values are
the average of three determinations.



matters for quenching is not the average 
distance between 6-FAM and TAMRA 
but, rather, how close TAMRA can get to 
6-FAM during the lifetime of the 6-FAM 
excited state. As long as the decay time of 
the excited state is relatively long com- 
pared with the molecular motions of the 
oligonucleotide, quenching can occur. 
Thus, we propose that TAMRA at the 3' 
end, or any other position, can quench 
6-FAM at the 5' end because TAMRA is in 
proximity to 6-FAM often enough to be 
able to accept energy transfer from an 
excited 6-FAM. 

Details of the fluorescence measure- 
ments remain puzzling. For example, Ta- 
ble 3 shows that hybridization of probes 
Al-26, A3-24, and P5-28 to their comple- 
mentary strands not only causes a large 
increase in 6-FAM fluorescence at 518
nm but also causes a modest increase in
TAMRA fluorescence at 582 nm. If
TAMRA is being excited by energy trans- 
fer from quenched 6-FAM, then loss of 
quenching attributable to hybridization 
should cause a decrease in the fluores- 
cence emission of TAMRA. The fact that 
the fluorescence emission of TAMRA in- 
creases indicates that the situation is 
more complex. For example, we have an- 
ecdotal evidence that the bases of the 
oligonucleotide, especially G, quench 
the fluorescence of both 6-FAM and 
TAMRA to some degree. When double- 
stranded, base-pairing may reduce the 
ability of the bases to quench. The pri- 
mary factor causing the quenching of 
6-FAM in an intact probe is the TAMRA 
dye. Evidence for the importance of 
TAMRA is that 6-FAM fluorescence 
remains relatively unchanged when 
probes labeled only with 6-FAM are used 
in the 5' nuclease PCR assay (data not 
shown). Secondary effectors of fluores- 
cence, both before and after cleavage of 
the probe, need to be explored further. 

Regardless of the physical mecha- 
nism, the relative independence of posi- 
tion and quenching greatly simplifies 
the design of probes for the 5' nuclease 
PCR assay. There are three main factors 
that determine the performance of a 
double-labeled fluorescent probe in the 
5' nuclease PCR assay. The first factor is 
the degree of quenching observed in the 
intact probe. This is characterized by the
value of RQ−, which is the ratio of reporter
to quencher fluorescent emissions
for a no-template control PCR. Influences
on the value of RQ− include
the particular reporter and quencher

FIGURE 3 Effect of Mg2+ concentration on RQ ratio for the A1 series of probes. The fluorescence
emission intensity at 518 and 582 nm was measured for solutions containing 50 nM probe, 10 mM
Tris-HCl (pH 8.3), 50 mM KCl, and varying amounts (0-10 mM) of MgCl2. The calculated RQ
ratios (518 nm intensity divided by 582 nm intensity) are plotted vs. MgCl2 concentration (mM
Mg). The key (upper right) shows the probes examined.



dyes used, spacing between reporter and 
quencher dyes, nucleotide sequence 
context effects, presence of structure or 
other factors that reduce flexibility of 
the oligonucleotide, and purity of the 
probe. The second factor is the efficiency 
of hybridization, which depends on 
probe Tm, presence of secondary structure
in probe or template, annealing
temperature, and other reaction condi- 
tions. The third factor is the efficiency at 
which Taq DNA polymerase cleaves the 
bound probe between the reporter and 
quencher dyes. This cleavage is depen- 
dent on sequence complementarity be- 
tween probe and template as shown by 
the observation that mismatches in the 
segment between reporter and quencher
dyes drastically reduce the cleavage of 
probe/ 0 

The rise in RQ− values for the A1 series
of probes seems to indicate that the
degree of quenching is reduced somewhat
as the quencher is placed toward
the 3' end. The lowest apparent quenching
is observed for probe A1-19 (see Fig.
3) rather than for the probe where the
TAMRA is at the 3' end (A1-26). This is
understandable, as the conformation of 
the 3' end position would be expected to 
be less restricted than the conformation 
of an internal position. In effect, a 
quencher at the 3' end is freer to adopt 
conformations close to the 5' reporter 
dye than is an internally placed 
quencher. For the other three sets of 



probes, the interpretation of RQ− values
is less clear-cut. The A3 probes show the
same trend as A1, with the 3' TAMRA
probe having a larger RQ− than the internal
TAMRA probe. For the P2 pair,
both probes have about the same RQ−
value. For the P5 probes, the RQ− for the
3' probe is less than for the internally
labeled probe. Another factor that may
explain some of the observed variation is
that purity affects the RQ− value. Although
all probes are HPLC purified, a
small amount of contamination with
unquenched reporter can have a large effect
on RQ−.

Although there may be a modest effect on degree of quenching, the
position of the quencher apparently can have a large effect on the
efficiency of probe cleavage. The most drastic effect is observed with
probe A1-2, where placement of the TAMRA on the second nucleotide
reduces the efficiency of cleavage to almost zero. For the A3, P2, and
P5 probes, ΔRQ is much greater for the 3' TAMRA probes as compared
with the internal TAMRA probes. This is explained most easily by
assuming that probes with TAMRA at the 3' end are more likely to be
cleaved between reporter and quencher than are probes with TAMRA
attached internally. For the A1 probes, the cleavage efficiency of
probe A1-7 must already be quite high, as ΔRQ does not increase when
the quencher is placed closer to the 3' end. This illustrates the
importance of being able to use probes with a quencher on the 3' end
in the 5' nuclease PCR assay. In this assay, an increase in the
intensity of reporter fluorescence is observed only when the probe is
cleaved between the reporter and quencher dyes. By placing the
reporter and quencher dyes on the opposite ends of an oligonucleotide
probe, any cleavage that occurs will be detected. When the quencher is
attached to an internal nucleotide, sometimes the probe works well
(A1-7) and other times not so well (A3-6). The relatively poor
performance of probe A3-6 presumably means the probe is being cleaved
3' to the quencher rather than between the reporter and quencher.
Therefore, the best chance of having a probe that reliably detects
accumulation of PCR product in the 5' nuclease PCR assay is to use a
probe with the reporter and quencher dyes on opposite ends.

Placing the quencher dye on the 3' 
end may also provide a slight benefit in 
terms of hybridization efficiency. The 
presence of a quencher attached to an 
internal nucleotide might be expected to 
disrupt base-pairing and reduce the Tm
of a probe. In fact, a 2°C-3°C reduction
in Tm has been observed for two probes
with internally attached TAMRAs. w This 
disruptive effect would be minimized by 
placing the quencher at the 3' end. Thus, 
probes with 3' quenchers might exhibit 
slightly higher hybridization efficiencies 
than probes with internal quenchers. 

The combination of increased cleav- 
age and hybridization efficiencies means 
that probes with 3' quenchers probably 
will be more tolerant of mismatches be- 
tween probe and target as compared 
with internally labeled probes. This tol- 
erance of mismatches can be advanta- 
geous, as when trying to use a single 
probe to detect PCR-amplified products 
from samples of different species. Also, it 
means that cleavage of probe during PCR is less sensitive to
alterations in annealing temperature or other reaction conditions. The
one application where tolerance of mismatches may be a disadvantage is
for allelic discrimination. Lee et al.(1) demonstrated that
allele-specific probes were cleaved between reporter and quencher only
when hybridized to a perfectly complementary target. This allowed them
to distinguish the normal human cystic fibrosis allele from the ΔF508
mutant. Their probes had TAMRA attached to the seventh nucleotide from
the 5' end and were designed so that any
mismatches were between the reporter 
and quencher. Increasing the distance 
between reporter and quencher would 
lessen the disruptive effect of mis- 
matches and allow cleavage of the probe 
on the incorrect target. Thus, probes 
with a quencher attached to an internal 
nucleotide may still be useful for allelic 
discrimination. 

In this study loss of quenching upon 
hybridization was used to show that 
quenching by a 3' TAMRA is dependent 
on the flexibility of a single-stranded oli- 
gonucleotide. The increase in reporter 
fluorescence intensity, though, could 
also be used to determine whether hy- 
bridization has occurred or not. Thus, 
oligonucleotides with reporter and 
quencher dyes attached at opposite ends 
should also be useful as hybridization 
probes. The ability to detect hybridization in real time means that
these probes could be used to measure hybridization kinetics. Also,
this type of probe could be used to develop homogeneous hybridization
assays for diagnostics or other applications. Bagwell et al.(10)
describe just this type of homogeneous assay where hybridization of a
probe causes an increase in fluorescence caused by a loss of
quenching. However, they utilized a complex probe design that requires
adding nucleotides to both ends of the probe sequence to form two
imperfect hairpins. The results presented here
demonstrate that the simple addition of a reporter dye to one end of
an oligonucleotide and a quencher dye to the other end generates a
fluorogenic probe that can detect hybridization or PCR amplification.

Quantitative nucleic acid sequence analysis has had an important role
in many fields of biological research. Measurement of gene expression
(RNA) has been used extensively in monitoring biological responses to
various stimuli (Tan et al. 1994; Huang et al. 1995a,b; Prud'homme et
al. 1995). Quantitative gene analysis (DNA) has been used to determine
the genome quantity of a particular gene, as in the case of the human
HER2 gene, which is amplified in ~30% of breast tumors (Slamon et al.
1987). Gene and genome quantitation (DNA and RNA) also have been used
for analysis of human immunodeficiency virus (HIV) burden,
demonstrating changes in the levels of virus throughout the different
phases of the disease (Connor et al. 1993; Piatak et al. 1993b;
Furtado et al. 1995).

Many methods have been described for the quantitative analysis of
nucleic acid sequences (both for RNA and DNA; Southern 1975; Sharp et
al. 1980; Thomas 1980). Recently, PCR has proven to be a powerful tool
for quantitative nucleic acid analysis. PCR and reverse transcriptase
(RT)-PCR have permitted the analysis of minimal starting quantities of
nucleic acid (as little as one cell equivalent). This has made
possible many experiments that could not have been performed with
traditional methods. Although PCR has provided a powerful tool, it is
imperative that it be used properly for quantitation (Raeymaekers
1995). Many early reports of quantitative PCR and RT-PCR described
quantitation of the PCR product but did not measure the initial target
sequence quantity. It is essential to design proper controls for the
quantitation of the initial target sequences (Ferre 1992; Clementi et
al. 1993).

Researchers have developed several methods of quantitative PCR and
RT-PCR. One approach measures PCR product quantity in the log phase of
the reaction before the plateau (Kellogg et al. 1990; Pang et al.
1990). This method requires that each sample has equal input amounts
of nucleic acid and that each sample under analysis amplifies with
identical efficiency up to the point of quantitative analysis. A gene
sequence (contained in all samples at relatively constant quantity,
such as β-actin) can be used for sample amplification efficiency
normalization. Using conventional methods of PCR detection and
quantitation (gel electrophoresis or plate capture hybridization), it
is extremely laborious to assure that all samples are analyzed during
the log phase of the reaction (for both the target gene and the
normalization gene). Another method, quantitative competitive
(QC)-PCR, has been developed and is used widely for PCR quantitation.
QC-PCR relies on the inclusion of an internal control competitor in
each reaction (Becker-Andre 1991; Piatak et al. 1993a,b). The
efficiency of each reaction is normalized to the internal competitor.
A known amount of internal competitor can be added to each sample. To
obtain relative quantitation, the unknown target PCR product is
compared with the known competitor PCR product. Success of a
quantitative competitive PCR assay relies on developing an internal
control that amplifies with the same efficiency as the target
molecule. The design of the competitor and the validation of
amplification efficiencies require a dedicated effort. However,
because QC-PCR does not require that PCR products be analyzed during
the log phase of the amplification, it is the easier of the two
methods to use.

Several detection systems are used for quantitative PCR and RT-PCR
analysis: (1) agarose gels, (2) fluorescent labeling of PCR products
and detection with laser-induced fluorescence using capillary
electrophoresis (Fasco et al. 1995; Williams et al. 1996) or
acrylamide gels, and (3) plate capture and sandwich probe
hybridization (Mulder et al. 1994). Although these methods proved
successful, each method requires post-PCR manipulations that add time
to the analysis and may lead to laboratory contamination. The sample
throughput of these methods is limited (with the exception of the
plate capture approach), and, therefore, these methods are not well
suited for uses demanding high sample throughput (i.e., screening of
large numbers of biomolecules or analyzing samples for diagnostics or
clinical trials).

Here, we report the development of a novel assay for quantitative DNA
analysis. The assay is based on the use of the 5' nuclease assay first
described by Holland et al. (1991). The method uses the 5' nuclease
activity of Taq polymerase to cleave a nonextendible hybridization
probe during the extension phase of PCR. The approach uses
dual-labeled fluorogenic hybridization probes (Lee et al. 1993;
Bassler et al. 1995; Livak et al. 1995a,b). One fluorescent dye serves
as a reporter [FAM (i.e., 6-carboxyfluorescein)] and its emission
spectrum is quenched by the second fluorescent dye, TAMRA (i.e.,
6-carboxy-tetramethylrhodamine). The nuclease degradation of the
hybridization probe releases the quenching of the FAM fluorescent
emission, resulting in an increase in peak fluorescent emission at 518
nm. The use of a sequence detector (ABI Prism) allows measurement of
fluorescent spectra of all 96 wells of the thermal cycler continuously
during the PCR amplification. Therefore, the reactions are monitored
in real time. The output data are described and quantitative analysis
of input target DNA sequences is discussed below.



RESULTS 

PCR Product Detection in Real Time

The goal was to develop a high-throughput, sensitive, and accurate
gene quantitation assay for use in monitoring lipid-mediated
therapeutic gene delivery. A plasmid encoding human factor VIII gene
sequence, pF8TM (see Methods), was used as a model therapeutic gene.
The assay uses fluorescent Taqman methodology and an instrument
capable of measuring fluorescence in real time (ABI Prism 7700
Sequence Detector). The Taqman reaction requires a hybridization probe
labeled with two different fluorescent dyes. One dye is a reporter dye
(FAM), the other is a quenching dye (TAMRA). When the probe is intact,
fluorescent energy transfer occurs and the reporter dye fluorescent
emission is absorbed by the quenching dye (TAMRA). During the
extension phase of the PCR cycle, the fluorescent hybridization probe
is cleaved by the 5'-3' nucleolytic activity of the DNA polymerase. On
cleavage of the probe, the reporter dye emission is no longer
transferred efficiently to the quenching dye, resulting in an increase
of the reporter dye fluorescent emission spectra. PCR primers and
probes were designed for the human factor VIII sequence and human
β-actin gene (as described in Methods). Optimization reactions were
performed to choose the appropriate probe and magnesium concentrations
yielding the highest intensity of reporter fluorescent signal without
sacrificing specificity. The instrument uses a charge-coupled device
(i.e., CCD camera) for measuring the fluorescent emission spectra from
500 to 650 nm. Each PCR tube was monitored sequentially for 2.1 msec
with continuous monitoring throughout the amplification. Each tube was
re-examined every 8.5 sec. Computer software was designed to examine
the fluorescent intensity of both the reporter dye (FAM) and the
quenching dye (TAMRA). The fluorescent intensity of the quenching dye,
TAMRA, changes very little over the course of the PCR amplification
(data not shown). Therefore, the intensity of TAMRA dye emission
serves as an internal standard with which to normalize the reporter
dye (FAM) emission variations. The software calculates a value termed
ΔRn (or ΔRQ) using the following equation: ΔRn = (Rn+) - (Rn-), where
Rn+ = emission intensity of reporter/emission intensity of quencher at
any given time in a reaction tube, and Rn- = emission intensity of
reporter/emission intensity of quencher measured prior to PCR
amplification in that same reaction tube. For the purpose of
quantitation, the last three data points (ΔRns) collected during the
extension step for each PCR cycle were analyzed. The nucleolytic
degradation of the hybridization probe occurs during the extension
phase of PCR, and, therefore, reporter fluorescent emission increases
during this time. The three data points were averaged for each PCR
cycle and the mean value for each was plotted in an "amplification
plot" shown in Figure 1A. The ΔRn mean value is plotted on the y-axis,
and time, represented by cycle number, is plotted on the x-axis.
During the early cycles of the PCR amplification, the ΔRn value
remains at base line. When sufficient hybridization probe has been
cleaved by the Taq polymerase nuclease activity, the intensity of
reporter fluorescent emission increases. Most PCR amplifications reach
a plateau phase of reporter fluorescent emission if the reaction is
carried out to high cycle numbers. The amplification plot is examined
early in the reaction, at a point that represents the log phase of
product accumulation. This is done by assigning an arbitrary threshold
that is based on the variability of the base-line data. In Figure 1A,
the threshold was set at 10 standard deviations above the mean of
base-line emission calculated from cycles 1 to 15. Once the threshold
is chosen, the point at which
Figure 1 PCR product detection in real time. (A) Amplification plot
derived from the extension phase fluorescent emission data collected
during the PCR amplification. The standard deviation is determined
from the data points collected from the base line of the amplification
plot. CT values are calculated by determining the point at which the
fluorescence exceeds a threshold limit set from the standard deviation
of the base line. (B) Overlay of amplification plots of serially
diluted DNA samples amplified with β-actin primers. (C) Input DNA
concentration of the samples plotted versus CT.
the amplification plot crosses the threshold is defined
as CT. CT is reported as the cycle number at
this point. As will be demonstrated, the CT value
is predictive of the quantity of input target.
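
The ΔRn normalization and the threshold-cycle definition described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and is not the instrument software: the function names, the synthetic logistic curve, and the linear interpolation between cycles are all assumptions made for the example. The baseline window (cycles 1 to 15) and the threshold of 10 standard deviations above the baseline mean follow the text.

```python
import math
import statistics

def delta_rn(reporter, quencher, reporter0, quencher0):
    """Delta Rn = Rn+ - Rn-: reporter/quencher ratio minus its pre-PCR value."""
    return reporter / quencher - reporter0 / quencher0

def threshold_cycle(delta_rn_by_cycle, baseline_cycles=15, n_sd=10.0):
    """Return the fractional cycle at which Delta Rn first exceeds the threshold.

    delta_rn_by_cycle[0] holds cycle 1. The threshold is the baseline mean
    plus n_sd sample standard deviations. Returns None if never exceeded.
    Interpolating linearly between cycles is an assumption of this sketch.
    """
    baseline = delta_rn_by_cycle[:baseline_cycles]
    threshold = statistics.mean(baseline) + n_sd * statistics.stdev(baseline)
    for i in range(baseline_cycles, len(delta_rn_by_cycle)):
        prev, cur = delta_rn_by_cycle[i - 1], delta_rn_by_cycle[i]
        if cur > threshold:
            if prev >= threshold:
                return float(i)  # already at threshold at cycle i
            # prev is cycle i, cur is cycle i + 1
            return i + (threshold - prev) / (cur - prev)
    return None

def synthetic_curve(midpoint, cycles=40, plateau=1.0, ripple=0.002):
    """Hypothetical logistic amplification curve with a small baseline ripple."""
    return [plateau / (1.0 + math.exp(-(c - midpoint))) + ripple * (-1) ** c
            for c in range(1, cycles + 1)]

if __name__ == "__main__":
    for midpoint in (25, 28):
        print(midpoint, threshold_cycle(synthetic_curve(midpoint)))
```

In this sketch a curve whose rise is delayed by three cycles yields a CT roughly three cycles later, which is the behavior the text relies on when it relates CT to the quantity of input target.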

CT Values Provide a Quantitative Measurement of
Input Target Sequences

Figure 1B shows amplification plots of 15 different PCR amplifications
overlaid. The amplifications were performed on a 1:2 serial dilution
of human genomic DNA. The amplified target was human β-actin. The
amplification plots shift to the right (to higher threshold cycles) as
the input target quantity is reduced. This is expected because
reactions with fewer starting copies of the target molecule require
greater amplification to degrade enough probe to attain the threshold
fluorescence. An arbitrary threshold of 10 standard deviations above
the base line was used to determine the CT values. Figure 1C
represents the CT values plotted versus the sample dilution value.
Each dilution was amplified in triplicate PCR amplifications and
plotted as mean values with error bars representing one standard
deviation. The CT values decrease linearly with increasing target
quantity. Thus, CT values can be used as a quantitative measurement of
the input target number. It should be noted that the amplification
plot for the 15.6-ng sample shown in Figure 1B does not reflect the
same fluorescent rate of increase exhibited by most of the other
samples. The 15.6-ng sample also achieves endpoint plateau at a lower
fluorescent value than would be expected based on the input DNA. This
phenomenon has been observed occasionally with other samples (data not
shown) and may be attributable to late cycle inhibition; this
hypothesis is still under investigation. It is important to note that
the flattened slope and early plateau do not impact significantly the
calculated CT value, as demonstrated by the fit on the line shown in
Figure 1C. All triplicate amplifications resulted in very similar CT
values; the standard deviation did not exceed 0.5 for any dilution.
This experiment contains a >100,000-fold range of input target
molecules. Using CT values for quantitation permits a much larger
assay range than directly using total fluorescent emission intensity
for quantitation. The linear range of fluorescent intensity
measurement of the ABI Prism 7700 Se-

ments over a very large range of relative starting
target quantities.
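
The linear relationship between CT and input quantity can be illustrated with a least-squares fit of CT against the logarithm of the input amount. The Python sketch below uses a synthetic 1:2 dilution series in which every two-fold dilution costs exactly one extra cycle; these numbers are hypothetical and are not the data of Figure 1C. The efficiency formula, E = 10^(-1/slope) - 1, is the commonly used relation for such a standard curve and is not stated in the text, so it should be read as an assumption of this example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amplification_efficiency(slope):
    """Per-cycle efficiency implied by a CT vs. log10(input) slope.

    A value of 1.0 corresponds to perfect doubling in every cycle.
    """
    return 10.0 ** (-1.0 / slope) - 1.0

# Synthetic series (hypothetical): 100 ng diluted 1:2 seven times,
# with CT rising by exactly one cycle per two-fold dilution.
inputs_ng = [100.0 / 2 ** k for k in range(8)]
cts = [18.0 + k for k in range(8)]

slope, intercept = fit_line([math.log10(q) for q in inputs_ng], cts)

if __name__ == "__main__":
    print(f"slope={slope:.3f} intercept={intercept:.3f} "
          f"efficiency={amplification_efficiency(slope):.3f}")
```

With ideal doubling the fitted slope is -1/log10(2), about -3.32 cycles per ten-fold change in input, so fitted slopes can be compared against that reference value.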

Sample Preparation Validation 

Several parameters influence the efficiency of PCR amplification:
magnesium and salt concentrations, reaction conditions (i.e., time and
temperature), PCR target size and composition, primer sequences, and
sample purity. All of the above factors are common to a single PCR
assay, except sample to sample purity. In an effort to validate the
method of sample preparation for the factor VIII assay, PCR
amplification reproducibility and efficiency of 10 replicate sample
preparations were examined. After genomic DNA was prepared from the 10
replicate samples, the DNA was quantitated by ultraviolet
spectroscopy. Amplifications were performed analyzing β-actin gene
content in 100 and 25 ng of total genomic DNA. Each PCR amplification
was performed in triplicate. Comparison of CT values for each
triplicate sample shows minimal variation based on standard deviation
and coefficient of variance (Table 1). Therefore, each of the
triplicate PCR amplifications was highly reproducible, demonstrating
that real time PCR using this instrumentation introduces minimal
variation into the quantitative PCR analysis. Comparison of the mean
CT values of the 10 replicate sample preparations also showed minimal
variability, indicating that each sample preparation yielded similar
results for β-actin gene quantity. The highest CT difference between
any of the samples was 0.85 and 0.73 for the 100 and 25 ng samples,
respectively. Additionally, the amplification of each sample exhibited
an equivalent rate of fluorescent emission intensity change per amount
of DNA target analyzed, as indicated by similar slopes derived from
the sample dilutions (Fig. 2). Any sample containing an excess of a
PCR inhibitor would exhibit a greater measured β-actin CT value for a
given quantity of DNA. In addition, the inhibitor would be diluted
along with the sample in the dilution analysis (Fig. 2), altering the
expected CT value change. Each sample amplification yielded a similar
result in the analysis, demonstrating that this method of sample
preparation is highly reproducible with regard to sample purity.

Quantitative Analysis of a Plasmid After

Table 1. Reproducibility of Sample Preparation Method

                         100 ng                                        25 ng
Sample       CT                    mean   std dev  CV       CT                    mean   std dev  CV
no.
1            18.24, 18.23, 18.33   18.27  0.06     0.32     20.48, 20.55, 20.5    20.51  0.03     0.17
2            18.33, 18.35, 18.44   18.37  0.06     0.32     20.61, 20.59, 20.41   20.54  0.11     ?
3            18.3, 18.3, 18.42     18.34  0.07     0.36     20.54, 20.6, 20.49    20.54  0.06     ?
4            18.15, 18.23, 18.32   18.23  0.08     0.46     20.48, 20.44, 20.38   20.43  0.05     ?
5            18.4, 18.38, 18.46    18.42  0.01     ?        20.68, 20.87, ?       20.73  0.13     ?
6            18.54, 18.67, 19      18.74  0.21     1.26     21.09, 21.04, 21.04   21.06  0.03     0.15
7            18.28, 18.36, 18.52   18.39  0.12     0.66     20.67, 20.73, 20.65   20.68  0.04     0.2
8            18.46, 18.7, 18.73    18.63  0.16     0.83     20.98, 20.84, 20.75   20.86  0.12     0.57
9            18.18, 18.34, 18.26   18.29  0.1      0.55     20.46, 20.54, 20.48   20.51  0.07     0.32
10           18.42, 18.57, 18.66   18.55  0.12     0.65     20.79, 20.78, 20.62   20.73  0.1      0.16
Mean (1-10)                        18.42  0.17     0.90                           20.66  0.19     0.94
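
The per-sample statistics in Table 1 can be reproduced from the triplicate CT values. The Python sketch below does so for the 100 ng triplicate of sample 7. It assumes that the standard deviation is the sample standard deviation (n - 1 in the denominator) and that CV is the standard deviation expressed as a percentage of the mean. The table does not state either convention, but these choices reproduce the tabulated values for this sample. The function name is illustrative.

```python
import statistics

def triplicate_summary(ct_values):
    """Return (mean, sample standard deviation, CV in percent) for CT replicates."""
    mean = statistics.mean(ct_values)
    sd = statistics.stdev(ct_values)  # sample SD, n - 1 denominator (assumed)
    cv_percent = 100.0 * sd / mean    # CV as a percentage (assumed)
    return mean, sd, cv_percent

if __name__ == "__main__":
    # 100 ng triplicate of sample 7, as read from Table 1
    mean, sd, cv = triplicate_summary([18.28, 18.36, 18.52])
    print(f"mean={mean:.2f} sd={sd:.2f} CV={cv:.2f}%")
    # prints: mean=18.39 sd=0.12 CV=0.66%
```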



vector containing a partial cDNA for human factor VIII, pF8TM. A
series of transfections was set up using a decreasing amount of the
plasmid (40, 4, 0.5, and 0.1 µg). Twenty-four hours
post-transfection, total DNA was purified from each flask of cells.
β-Actin gene quantity was chosen as a value for normalization of
genomic DNA concentration from each sample. In this experiment,
β-actin gene content should remain constant relative to total genomic
DNA. Figure 3 shows the result of the β-actin DNA measurement (100 ng
total DNA determined by ultraviolet spectroscopy) of each sample. Each
sample was analyzed in triplicate and the mean β-actin CT values of
the triplicates were plotted (error bars represent one standard
deviation). The highest difference between any two sample means was
0.95 CT. Ten nanograms of total DNA of each sample were also examined
for β-actin. The results again showed that very similar amounts of
genomic DNA were present; the maximum mean β-actin CT value difference
was 1.0. As Figure 3 shows, the rate of β-actin CT change between the
100 and 10-ng samples was similar (slope values range between -3.56
and -3.45). This verifies again that the method of sample preparation
yields samples of identical PCR integrity (i.e., no sample contained
an excessive amount of a PCR inhibitor). However, these results
indicate that each sample contained slight differences in the actual
amount of genomic DNA analyzed. Determination of actual genomic DNA
concentration was accomplished
Figure 2 Sample preparation purity. The replicate
samples shown in Table 1 were also amplified in
triplicate using 25 ng of each DNA sample. The figure
shows the input DNA concentration (100 and
25 ng) vs. CT. In the figure, the 100 and 25 ng
points for each sample are connected by a line.



by plotting the mean β-actin CT value obtained
for each 100-ng sample on a β-actin standard
curve (shown in Fig. 4C). The actual genomic
DNA concentration of each sample, a, was obtained
by extrapolation to the x-axis.
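
The slope comparison used above as a check for PCR inhibitors amounts to dividing the change in CT by the change in log10 of the input amount between two dilutions of the same sample. A minimal Python sketch follows. The function name and the example CT values are hypothetical, chosen only so that the result falls in the range of slopes quoted in the text.

```python
import math

def two_point_slope(input_hi_ng, ct_hi, input_lo_ng, ct_lo):
    """Slope of CT vs. log10(input DNA) between two dilutions of one sample."""
    return (ct_hi - ct_lo) / (math.log10(input_hi_ng) - math.log10(input_lo_ng))

if __name__ == "__main__":
    # Hypothetical beta-actin CT values at 100 ng and 10 ng of total DNA
    slope = two_point_slope(100.0, 18.0, 10.0, 21.5)
    print(f"slope = {slope:.2f} cycles per ten-fold change in input")
```

Samples that share a similar slope respond to dilution in the same way, which is the basis for concluding that none of them carries an excess of inhibitor.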

Figure 4A shows the measured (i.e., non-normalized)
quantities of factor VIII plasmid
DNA (pF8TM) from each of the four transient cell
transfections. Each reaction contained 100 ng of
total sample DNA (as determined by UV spectroscopy).
Each sample was analyzed in triplicate



[Figure 3 plot: β-actin CT vs. log (ng input DNA); legend: pF8TM transfected.]

Figure 3 Analysis of transfected cell DNA quantity
and purity. The DNA preparations of the four 293
cell transfections (40, 4, 0.5, and 0.1 µg of pF8TM)
were analyzed for the β-actin gene. 100 and 10 ng
(determined by ultraviolet spectroscopy) of each
sample were amplified in triplicate. For each
amount of pF8TM that was transfected, the β-actin
CT values are plotted versus the total input DNA



PCR amplifications. As shown, pF8TM purified from the 293 cells
decreases (mean CT values increase) with decreasing amounts of plasmid
transfected. The mean CT values obtained for pF8TM in Figure 4A were
plotted on a standard curve comprised of serially diluted pF8TM, shown
in Figure 4B. The quantity of pF8TM, b, found in each of the four
transfections was determined by extrapolation to the x-axis of the
standard curve in Figure 4B. These uncorrected values, b, for pF8TM
were normalized to determine the actual amount of pF8TM found per
100 ng of genomic DNA by using the equation:

(b x 100 ng) / a = actual pF8TM copies per 100 ng of genomic DNA

where a = actual genomic DNA in a sample and b = pF8TM copies from the
standard curve. The normalized quantity of pF8TM per 100 ng of genomic
DNA for each of the four transfections is shown in Figure 4D. These
results show that the quantity of factor VIII plasmid associated with
the 293 cells, 24 hr after transfection, decreases with decreasing
plasmid concentration used in the transfection. The quantity of pF8TM
associated with 293 cells, after transfection with 40 µg of plasmid,
was 35 pg per 100 ng genomic DNA. This results in ~520 plasmid copies
per cell.
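
The two steps described above, reading a quantity off a standard curve and normalizing it to 100 ng of genomic DNA, can be sketched as follows. The normalization follows the equation given in the text, (b x 100 ng) / a. The standard curve is assumed to be a straight line of CT against log10 of quantity, and the slope, intercept, and CT values used in the example are hypothetical placeholders rather than values from Figure 4.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve of the form CT = slope * log10(quantity) + intercept."""
    return 10.0 ** ((ct - intercept) / slope)

def normalize_per_100ng(b, a_ng):
    """Apply (b x 100 ng) / a, where a is the actual genomic DNA in the sample."""
    if a_ng <= 0:
        raise ValueError("genomic DNA amount must be positive")
    return b * 100.0 / a_ng

if __name__ == "__main__":
    # Hypothetical standard-curve parameters and measurements
    b = quantity_from_ct(ct=21.68, slope=-3.32, intercept=25.0)  # plasmid quantity
    a = 80.0  # ng of genomic DNA actually present, from the beta-actin curve
    print(f"uncorrected b = {b:.2f}, normalized = {normalize_per_100ng(b, a):.2f}")
```

In this example a sample that actually contained 80 ng rather than 100 ng of genomic DNA has its plasmid quantity scaled up by 100/80, which is the correction the normalization is meant to provide.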



DISCUSSION 

We have described a new method for quantitating gene copy numbers
using real-time analysis of PCR amplifications. Real-time PCR is
compatible with either of the two PCR (RT-PCR) approaches: (1)
quantitative competitive, where an internal competitor for each target
sequence is used for normalization (data not shown), or (2)
quantitative comparative PCR using a normalization gene contained
within the sample (i.e., β-actin) or a "housekeeping" gene for RT-PCR.
If equal amounts of nucleic acid are analyzed for each sample and if
the amplification efficiency before quantitative analysis is identical
for each sample, the internal control (normalization gene or
competitor) should give equal signals for all samples.

The real-time PCR method offers several advantages
over the other two methods currently
employed (see the Introduction). First, the
real-time PCR method is performed in a closed-tube
system and requires no post-PCR manipulation
Figure 4 Quantitative analysis of pF8TM in transfected cells. (A) Amount of
plasmid DNA used for the transfection plotted against the mean CT value
determined for pF8TM remaining 24 hr after transfection. (B,C) Standard curves of
pF8TM and β-actin, respectively. pF8TM DNA (B) and genomic DNA (C) were
diluted serially 1:5 before amplification with the appropriate primers. The β-actin
standard curve was used to normalize the results of A to 100 ng of genomic DNA.
(D) The amount of pF8TM present per 100 ng of genomic DNA.



of sample. Therefore, I In- potential for I'Cll con- 
I .mil nation in the laboratory is reduced because 
amplified products can (»« analysed and disposed 
of without opening thi' roast ion tubes. Second, 
(his method suppoiU the uw of a iiuriiiii1iy.iitk>i] 
gene, (i.e., fj-oclin) for quantitative. PCR or house- 
keeping genes for cjoantitntWc RT-1'CK controls. 
Analysis Is performed in real lime during the Jog 
phase of product accumulation. Analysis during 
kin phase permits many different genes (over a 
wide input target range) to be analyzed simulta- 
neously, without concern of reach) Jig reaction 
plateau at different cycles. This will make imilll- 
ge.iu-. analysis assays much caMe.i t\< develop, be- 
cause individual internal i.uiiipelllots will not be 
needed for ench gene ujidcr analysis- Third, 
.->cu»plc throughput will iuiieasc dianmlically 
with the new method because, there is no post- 
)*CK procc.-ising lime. Additionally, winking In a 
iJ6-we.ll formal is highly compatible with auto- 
million technology, 

The real-time PCR method is highly reproducible.
Replicate amplifications can be analyzed
for each sample minimizing potential error. The
system allows for a very large assay dynamic
range (approaching 1,000,000-fold starting
target). Using a standard curve for the target of
interest, relative copy number values can be
determined for any unknown sample. Fluorescent
threshold values, CT, correlate linearly with
relative DNA copy numbers. Real time quantitative
RT-PCR methodology (Gibson et al., this issue)
has also been developed. Finally, real time
quantitative PCR methodology can be used to develop
high-throughput screening assays for a variety of
applications [quantitative gene expression
(RT-PCR), gene copy assays (Her2, HIV, etc.),
genotyping (knockout mouse analysis), and
immuno-PCR].
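
The standard-curve step described above can be sketched as follows. This is a minimal illustration, assuming the usual form of such curves in which CT is fitted as a straight line against the base-10 logarithm of starting copy number. The function names and the dilution-series values are hypothetical and are not data from the paper.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate starting copies for an unknown."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series (illustrative values only).
std_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
std_ct = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_standard_curve(std_copies, std_ct)
estimate = copies_from_ct(28.05, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(estimate))
```

A lower CT means more starting template, so the fitted slope is negative. Unknowns should fall inside the range spanned by the standards, since the fit says nothing about behavior outside it.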

Real-time PCR may also be performed using
intercalating dyes (Higuchi et al. 1992) such as
ethidium bromide. The fluorogenic probe
method offers a major advantage over
intercalating dyes: greater specificity (i.e., primer
dimers and nonspecific PCR products are not
detected).

METHODS

Generation of a Plasmid Containing a Partial
cDNA for Human Factor VIII

Total RNA was harvested (RNAzol B from Tel-Test, Inc.,
Friendswood, TX) from cells transfected with a factor VIII
expression vector, pCIS2.8c25D (Eaton et al. 1986;
Gorman et al. 1990). A factor VIII partial cDNA sequence was
generated by RT-PCR [GeneAmp EZ rTth RNA PCR Kit
(part N808-0179, PE Applied Biosystems, Foster City, CA)]
using the PCR primers F8for and F8rev (primer sequences
are shown below). The amplicon was reamplified using
modified F8for and F8rev primers appended with BamHI
and HindIII restriction site sequences at the 5' end and
cloned into pGEM-3Z (Promega Corp., Madison, WI). The
resulting clone, pF8TM, was used for transient transfection
of 293 cells.



Amplification of Target DNA and Detection of
Amplicon

Factor VIII plasmid DNA (pF8TM) was amplified with the
primers F8for 5'-<:<;<:-<rmc(;AAi;Au:njA<Aiic;Ti.V3' and
F8rev .v-aaacci--tlAGCXTrOGATtit.iTAC'iCi-.V. The reaction
produced a [...] bp PCR product. The forward primer was
designed to recognize a unique sequence found in the 5'
untranslated region of the parent pCIS2.8c25D plasmid and
therefore does not recognize and amplify the human factor VIII
gene. Primers were chosen with the assistance of the computer
program Oligo 4.0 (National Biosciences, Inc., Plymouth, MN).
The human β-actin gene was amplified with the primers
β-actin forward primer 5'-TCACCCACACTGTGCCCATCTACGA-3'
and β-actin reverse primer
5'-CAGCGGAACCGCTCATTGCCAATGG-3'. The reaction
produced a 295 bp PCR product.

Amplification reactions (50 µl) contained a DNA
sample, 10× PCR Buffer II (5 µl), 200 µM dATP, dCTP,
dGTP, and 400 µM dUTP, 4 mM MgCl2, [...] Units AmpliTaq
DNA polymerase, 0.5 unit AmpErase uracil N-glycosylase
(UNG), 50 pmole of each factor VIII primer, and 15
pmole of each β-actin primer. The reactions also contained
one of the following detection probes (100 nM each):
F8probe 5'(FAM)Ac:cri , <rj'c:cu(:(:T<iCriT(: , i"i v rcTc: , i'.
GCCTT(TAMRA)p-3' and β-actin probe 5'(FAM)ATGCCC-
X(TAMRA)CCCCCATGCCATCp-3', where p indicates
phosphorylation and X indicates a linker arm nucleotide.
Reaction tubes were MicroAmp Optical Tubes (part number
N801 0933, Perkin Elmer) that were frosted (at Perkin
Elmer) to prevent light from reflecting. Tube caps were
similar to MicroAmp Caps but specially designed to prevent
light scattering. All of the PCR consumables were supplied
by PE Applied Biosystems (Foster City, CA) except
the factor VIII primers, which were synthesized at Genentech,
Inc. (South San Francisco, CA). Probes were designed
using the Oligo 4.0 software, following guidelines suggested
in the Model 7700 Sequence Detector instrument
manual. Briefly, probe Tm should be at least 5°C higher
than the annealing temperature used during thermal cycling;
primers should not form stable duplexes with the
probe.

The thermal cycling conditions included 2 min at
50°C and 10 min at 95°C. Thermal cycling proceeded with
[...]
reactions were performed in the Model 7700 Sequence Detector
(PE Applied Biosystems), which contains a GeneAmp
PCR System 9600. Reaction conditions were programmed
on a Power Macintosh 7100 (Apple Computer,
Santa Clara, CA) linked directly to the Model 7700 Sequence
Detector. Analysis of data was also performed on
the Macintosh computer. Collection and analysis software
was developed at PE Applied Biosystems.

Transfection of Cells with Factor VIII Construct

Four T175 flasks of 293 cells (ATCC CRL 1573), a human
fetal kidney suspension cell line, were grown to 80%
confluency and transfected with pF8TM. Cells were grown in the
following media: 50% HAM'S F12 without GHT, 50% low
glucose Dulbecco's modified Eagle medium (DMEM) without
glycine with sodium bicarbonate, 10% fetal bovine
serum, 2 mM L-glutamine, and 1% penicillin-streptomycin.
The media was changed 30 min before the transfection.
pF8TM DNA amounts of 40, 4, [...], and 0.1 µg were
added to 1.5 ml of a solution containing [...] M CaCl2
and 1× HEPES. The four mixtures were left at room
temperature for 10 min and then added dropwise to the cells.
The flasks were incubated at 37°C and 5% CO2 for 24 hr,
washed with PBS, and resuspended in PBS. The resuspended
cells were divided into aliquots and DNA was
extracted immediately using the QIAamp Blood Kit (Qiagen,
Chatsworth, CA). DNA was eluted into 200 µl of 30 mM
Tris-HCl at pH 8.0.

ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification 
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-
20q13 and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.

Wnt-1 is a member of an expanding family of cysteine-rich, 
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1,
2). Wnt-1 originally was identified as an oncogene activated by
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen 
synthase kinase-3β (GSK-3β) resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the
transcription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations
contribute to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran- 
scriptionally activated downstream components activated by 
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of 
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
†To whom reprint requests should be addressed. e-mail: diane@gene.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones
encoding full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512.
Full-length cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of 
first-strand cDNA was performed with human Multiple Tissue 
cDNA panels (CLONTECH) and 300 µM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. 33P-labeled sense and antisense
riboprobes were transcribed from an 897-bp PCR product
corresponding to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of 
mouse WISP-2. All tissues were processed as described (40). 

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS
174T). DNA concentration was determined by using Hoechst
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate
dehydrogenase housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems.
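
The 2^(ΔCt) calculation can be illustrated with a short Python sketch. The sign convention (reference Ct minus sample Ct) and the way the housekeeping-gene correction is combined with the target ΔCt are assumptions made for this example; the text gives only the formula and states that the signal was normalized to glyceraldehyde-3-phosphate dehydrogenase. The function name and Ct values are hypothetical.

```python
def relative_quantity(ct_target_ref, ct_target_sample,
                      ct_norm_ref=None, ct_norm_sample=None):
    """Return 2**dCt for a target gene, optionally normalized.

    dCt = Ct(reference) - Ct(sample) for the target gene, so a sample
    that crosses threshold earlier than the reference gives dCt > 0 and
    a relative quantity above 1. If housekeeping-gene Ct values are
    supplied, their own reference-minus-sample difference is subtracted
    first (an assumed normalization scheme).
    """
    delta_ct = ct_target_ref - ct_target_sample
    if ct_norm_ref is not None and ct_norm_sample is not None:
        delta_ct -= ct_norm_ref - ct_norm_sample
    return 2.0 ** delta_ct


# Hypothetical values: target detected 2 cycles earlier in the tumor.
print(relative_quantity(26.0, 24.0))              # 4.0
# Same target, but the housekeeping gene is also 1 cycle earlier.
print(relative_quantity(26.0, 24.0, 20.0, 19.0))  # 2.0
```

The base of 2 encodes the assumption that product doubles each cycle, so one cycle of difference corresponds to a 2-fold difference in starting template.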

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify
Wnt-1-inducible genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially ex- 
pressed cDNAs (1,384 total) were sequenced. Thirty-nine 
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells. 

Two of the cDNAs, WISP-1 and WISP-2, were differentially 
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce 
the morphological transformation of C57MG cells and has no 
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length, 
respectively, and encode proteins of 367 aa, with predicted 
relative molecular masses of ~40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cys- 
teine residues, and four potential N-linked glycosylation sites 
and are 84% identical (Fig. 2A). 

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,293 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative
molecular masses of ~27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at
position 197. WISP-2 has 28 cysteine residues that are
conserved among the 38 cysteines found in WISP-1.

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/
Wnt-4 cells. Poly(A)+ RNA (2 µg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

Identification of WISP-3. To search for related proteins, we 
screened expressed sequence tag (EST) databases with the 
WISP-1 protein sequence and identified several ESTs as 
potentially related sequences. We identified a homologous 
protein that we have called WISP-3. A full-length human 
WISP-3 cDNA of 1371 bp was isolated corresponding to those 
ESTs that encode a 354-aa protein with a predicted molecular 
mass of 39,293. WISP-3 has two potential N-linked glycosyl- 
ation sites and 36 cysteine residues. An alignment of the three 
human WISP proteins shows that WISP-1 and WISP-3 are the 
most similar (42% identity), whereas WISP-2 has 37% identity 
with WISP-1 and 32% identity with WISP-3 (Fig. 3A). 
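
Percent identity figures such as those above are computed from a pairwise alignment. The following Python sketch shows one common way to do this, counting identical residues over aligned (non-gap) columns. The text does not state which denominator was used, so that choice is an assumption, and the two short aligned strings are made up for illustration rather than taken from the WISP sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 9 comparable columns, 7 identical.
toy_a = "GCGCCKVC-A"
toy_b = "GCQCCRVCTA"
print(round(percent_identity(toy_a, toy_b), 1))  # 77.8
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are present.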

WISPs Are Homologous to the CTGF Family of Proteins. 
Human WISP-1, WISP-2, and WISP-3 are novel sequences; 
however, mouse WISP-1 is the same as the recently identified 
Elml gene. Elml is expressed in low, but not high, metastatic 
mouse melanoma cells, and suppresses the in vivo growth and 
metastatic potential of K-1735 mouse melanoma cells (15). 
Human and mouse WISP-2 are homologous to the recently 
described rat gene, rCop-1 (16). Significant homology (36- 
44%) was seen to the CCN family of growth factors. This family 
includes three members, CTGF, Cyr61, and the protoonco- 
gene nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracel-
lular matrix signaling molecule that promotes cell adhesion, 
proliferation, migration, angiogenesis, and tumor growth (18, 
19). nov (nephroblastoma overexpressed) is an immediate 
early gene associated with quiescence and found altered in 
Wilms tumors (20). The proteins of the CCN family share 
functional, but not sequence, similarity to Wnt-1. All are 
secreted, cysteine-rich heparin binding glycoproteins that as- 
sociate with the cell surface and extracellular matrix. 

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic
representation of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence
(GCGCCXXC) conserved in most insulin-like growth factor
(IGF)-binding proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third 
position instead of a glycine. CTGF recently has been shown 
to specifically bind IGF (22) and a truncated nov protein 
lacking the IGF-BP domain is oncogenic (23). The von Wil- 
lebrand factor type C module (VWC), also found in certain 
collagens and mucins, covers the next 10 cysteine residues, and 
is thought to participate in protein complex formation and 
oligomerization (24). The VWC domain of WISP-3 differs 
from all CCN family members described previously, in that it 
contains only six of the 10 cysteine residues (Fig. 3 A and B). 
A short variable region follows the VWC domain. The third 
module, the thrombospondin (TSP) domain is involved in 
binding to sulfated glycoconjugates and contains six cysteine 
residues and a conserved WSxCSxxCG motif first identified in 
thrombospondin (25). The C-terminal (CT) module contain- 
ing the remaining 10 cysteines is thought to be involved in 
dimerization and receptor binding (26). The CT domain is 
present in all CCN family members described to date but is 
absent in WISP-2 (Fig. 3 A and B). The existence of a putative 
signal sequence and the absence of a transmembrane domain 
suggest that WISPs are secreted proteins, an observation 
supported by an analysis of their expression and secretion from 
mammalian cell and baculovirus cultures (data not shown). 
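
The two consensus motifs named above, GCGCCXXC and WSxCSxxCG, can be located in a protein sequence with a simple pattern search. The Python sketch below treats x or X as "any residue"; the helper names and the toy sequences are hypothetical and are not the WISP sequences.

```python
import re


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a motif such as 'WSxCSxxCG', where x/X matches any residue."""
    parts = ["[A-Z]" if ch in "xX" else re.escape(ch) for ch in motif]
    return re.compile("".join(parts))


def find_motif(sequence: str, motif: str):
    """Return (1-based start position, matched text) for each occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.group(0))
            for m in pattern.finditer(sequence.upper())]


# Toy sequence (hypothetical) containing one copy of each motif.
toy = "MAAGCGCCKVCAQLTWSPCSRTCG"
print(find_motif(toy, "GCGCCXXC"))   # [(4, 'GCGCCKVC')]
print(find_motif(toy, "WSxCSxxCG"))  # [(16, 'WSPCSRTCG')]

# A glutamine at the third position, as described for WISP-1,
# no longer matches the strict consensus.
print(find_motif("MAAGCQCCKVCA", "GCGCCXXC"))  # []
```

A strict match like this will miss variants such as the WISP-1 substitution, so a real search would typically allow for mismatches or use a profile-based method.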

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung, 
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C). 
Little or no expression was detected in the brain, liver, skeletal 
muscle, colon, peripheral blood leukocytes, prostate, testis, or 
thymus. WISP-2 had a more restricted tissue expression and 
was detected in adult skeletal muscle, colon, ovary, and fetal 
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3 
expression were detected in placenta, ovary, prostate, and 
small intestine. 

In Situ Localization of WISP-1 and WISP-2. Expression of 
WISP-1 and WISP-2 was assessed by in situ hybridization in 
mammary tumors from Wnt-1 transgenic mice. Strong expres- 
sion of WISP-1 was observed in stromal fibroblasts lying within 
the fibrovascular tumor stroma (Fig. 4 A-D). However, low- 
level WISP-1 expression also was observed focally within tumor 
cells (data not shown). No expression was observed in normal 
breast. Like WISP-1, WISP-2 expression also was seen in the 
tumor stroma in breast tumors from Wnt-1 transgenic animals 
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The
corresponding dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal (s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to
chromosome 6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene 
MYB (27, 29). 

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in 
many human tumors and has etiological and prognostic sig- 
nificance. For example, in a variety of tumor types, c-myc 
amplification has been associated with malignant progression 
and poor prognosis (30). Because WISP-1 resides in the same 
general chromosomal location (8q24) as c-myc, we asked 
whether it was a target of gene amplification, and, if so, 
whether this amplification was independent of the c-myc locus. 
Genomic DNA from human colon cancer cell lines was 
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat- 
ing an 8-fold increase. Significantly, the pattern of amplifica- 
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and 
WISP-2 was significantly greater than one, approximately 
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold 
for WISP-2 in 92% of the tumors (P < 0.001 for each). The 
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was signifi-
cantly higher than that of WISP-1 (P < 0.001). 

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.

Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by
quantitative PCR. (B) Southern blots containing genomic DNA (10 µg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least twice.



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which 
includes CTGF, Cyr61, and nov, a family not previously linked 
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1,
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3. 
This domain is thought to be involved in receptor binding and 
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine 
knot motif exist as dimers (32). It is tempting to speculate that 
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2 
exists as a monomer. If the CT domain is also important for 
receptor binding, WISP-2 may bind its receptor through a 
different region of the molecule than the other CCN family 
members. No specific receptors have been identified for CTGF 
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed
that mammary tumor cells or inflammatory cells at the tumor
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells 
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracellular
matrix. Stromal cell-derived factors in the extracellular
matrix have been postulated to play a role in tumor cell
migration and proliferation (35). The localization of WISP-1
and WISP-2 in the stromal cells of breast tumors supports this
paracrine model.

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal 
colonic mucosa from the same patient. The gene for human 
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node negative breast cancer and many colon cancers, suggesting
the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of 
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to human
carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated down- 
stream of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development.
methods. Peptides AENK or AEQK were dissolved in water, made isotonic with
NaCl and diluted into RPMI growth medium. T-cell-proliferation assays were
done essentially as described 20,21. Briefly, after antigen pulsing (30 μg ml⁻¹
TTCF) with tetrapeptides (1-2 mg ml⁻¹), PBMCs or EBV-B cells were
washed in PBS and fixed for 45 s in 0.05% glutaraldehyde. Glycine was added
to a final concentration of 0.1 M and the cells were washed five times in RPMI
1640 medium containing 1% FCS before co-culture with T-cell clones in
round-bottom 96-well microtitre plates. After 48 h, the cultures were pulsed
with 1 μCi of ³H-thymidine and harvested for scintillation counting 16 h later.
Predigestion of native TTCF was done by incubating 200 μg TTCF with 0.25 μg
pig kidney legumain in 500 μl 50 mM citrate buffer, pH 5.5, for 1 h at 37 °C.
Glycopeptide digestions. The peptides HIDNEEDI, HIDN(N-glucosamine)EEDI
and HIDNESDI, which are based on the TTCF sequence, and
QQQHLFGSNVTDCSGNFCLFR(KKK), which is based on human transferrin,
were obtained by custom synthesis. The three C-terminal lysine residues were
added to the natural sequence to aid solubility. The transferrin glycopeptide
QQQHLFGSNVTDCSGNFCLFR was prepared by tryptic (Promega) digestion
of 5 mg reduced, carboxy-methylated human transferrin followed by
concanavalin A chromatography. Glycopeptides corresponding to residues
622-642 and 421-452 were isolated by reverse-phase HPLC and identified by
mass spectrometry and N-terminal sequencing. The lyophilized transferrin-derived
peptides were redissolved in 50 mM sodium acetate, pH 5.5, 10 mM
dithiothreitol, 20% methanol. Digestions were performed for 3 h at 30 °C with
5-50 mU ml⁻¹ pig kidney legumain or B-cell AEP. Products were analysed by
HPLC or MALDI-TOF mass spectrometry using a matrix of 10 mg ml⁻¹
α-cyanocinnamic acid in 50% acetonitrile/0.1% TFA and a PerSeptive Biosystems
Elite STR mass spectrometer set to linear or reflector mode. Internal
standardization was obtained with a matrix ion of 568.13 mass units.
Fas ligand (FasL) is produced by activated T cells and natural
killer cells and it induces apoptosis (programmed cell death) in
target cells through the death receptor Fas/Apo1/CD95 (ref. 1).
One important role of FasL and Fas is to mediate immune-cytotoxic
killing of cells that are potentially harmful to the
organism, such as virus-infected or tumour cells (ref. 1). Here we
report the discovery of a soluble decoy receptor, termed decoy
receptor 3 (DcR3), that binds to FasL and inhibits FasL-induced
apoptosis. The DcR3 gene was amplified in about half of 35
primary lung and colon tumours studied, and DcR3 messenger
RNA was expressed in malignant tissue. Thus, certain tumours
may escape FasL-dependent immune-cytotoxic attack by expressing
a decoy receptor that blocks FasL.

By searching expressed sequence tag (EST) databases, we identi- 
fied a set of related ESTs that showed homology to the tumour 
necrosis factor (TNF) receptor (TNFR) gene superfamily 2 . Using 
the overlapping sequence, we isolated a previously unknown full- 
length complementary DNA from human fetal lung. We named the 
protein encoded by this cDNA decoy receptor 3 (DcR3). The cDNA 
encodes a 300-amino-acid polypeptide that resembles members of 
the TNFR family (Fig. 1a): the amino terminus contains a leader
sequence, which is followed by four tandem cysteine-rich domains 
(CRDs). Like one other TNFR homologue, osteoprotegerin (OPG) 3 , 
DcR3 lacks an apparent transmembrane sequence, which indicates 
that it may be a secreted, rather than a membrane-associated,
molecule. We expressed a recombinant, histidine-tagged form of 
DcR3 in mammalian cells; DcR3 was secreted into the cell culture 
medium, and migrated on polyacrylamide gels as a protein of 
relative molecular mass 35,000 (data not shown). DcR3 shares 
sequence identity in particular with OPG (31%) and TNFR2 
(29%), and has relatively less homology with Fas (17%). All of 
the cysteines in the four CRDs of DcR3 and OPG are conserved; 
however, the carboxy-terminal portion of DcR3 is 101 residues 
shorter. 

We analysed expression of DcR3 mRNA in human tissues by 
northern blotting (Fig. 1b). We detected a predominant 1.2-kilobase
transcript in fetal lung, brain, and liver, and in adult spleen, colon 
and lung. In addition, we observed relatively high DcR3 mRNA 
expression in the human colon carcinoma cell line SW480. 

To investigate potential ligand interactions of DcR3, we generated 
a recombinant, Fc-tagged DcR3 protein. We tested binding of 
DcR3-Fc to human 293 cells transfected with individual TNF- 
family ligands, which are expressed as type 2 transmembrane 
proteins (these transmembrane proteins have their N termini in 
the cytosol). DcR3-Fc showed a significant increase in binding to 
cells transfected with FasL 4 (Fig. 2a), but not to cells transfected with
TNF 5, Apo2L/TRAIL 6,7, Apo3L/TWEAK 8,9, or OPGL/TRANCE/
RANKL 10-13 (data not shown). DcR3-Fc immunoprecipitated shed
FasL from FasL-transfected 293 cells (Fig. 2b) and purified soluble 
FasL (Fig. 2c), as did the Fc-tagged ectodomain of Fas but not 
TNFR1. Gel-filtration chromatography showed that DcR3-Fc and 
soluble FasL formed a stable complex (Fig. 2d). Equilibrium 
analysis indicated that DcR3-Fc and Fas-Fc bound to soluble 
FasL with a comparable affinity (Kd = 0.8 ± 0.2 and
1.1 ± 0.1 nM, respectively; Fig. 2e), and that DcR3-Fc could
block nearly all of the binding of soluble FasL to Fas-Fc (Fig. 2e, 
inset). Thus, DcR3 competes with Fas for binding to FasL. 
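
As a rough illustration of what the reported dissociation constants imply, the sketch below evaluates a simple one-site binding isotherm, fraction bound = [L]/(Kd + [L]). The one-site model and the function name `fraction_bound` are assumptions made here for illustration; the text does not state which model was used to fit the binding data in Fig. 2e.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Fraction of receptor occupied under a one-site equilibrium model.

    Assumption: 1:1 binding with free ligand concentration `ligand_nM`
    and dissociation constant `kd_nM`, both in nM. This model is
    illustrative and is not taken from the paper's methods.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (kd_nM + ligand_nM)


if __name__ == "__main__":
    # Kd values reported in the text for DcR3-Fc and Fas-Fc (nM).
    kd_values = {"DcR3-Fc": 0.8, "Fas-Fc": 1.1}
    for receptor, kd in kd_values.items():
        # Occupancy at an arbitrary, hypothetical 1 nM soluble FasL.
        print(receptor, round(fraction_bound(1.0, kd), 3))
```

Under this model the two receptors reach half-maximal occupancy at similar ligand concentrations, which is what "comparable affinity" amounts to.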

To determine whether binding of DcR3 inhibits FasL activity, we 
tested the effect of DcR3-Fc on apoptosis induction by soluble 
FasL in Jurkat T leukaemia cells, which express Fas (Fig. 3a). DcR3- 
Fc and Fas-Fc blocked soluble-FasL-induced apoptosis in a 
similar dose-dependent manner, with half-maximal inhibition at 
~0.1 μg ml⁻¹. Time-course analysis showed that the inhibition did
not merely delay cell death, but rather persisted for at least 24 hours 
(Fig. 3b). We also tested the effect of DcR3-Fc on activation- 
induced cell death (AICD) of mature T lymphocytes, a FasL-dependent
process 1. Consistent with previous results 13, activation
of interleukin-2-stimulated CD4-positive T cells with anti-CD3 
antibody increased the level of apoptosis twofold, and Fas-Fc 
blocked this effect substantially (Fig. 3c); DcR3-Fc blocked the
induction of apoptosis to a similar extent. Thus, DcR3 binding
blocks apoptosis induction by FasL.

FasL-induced apoptosis is important in elimination of virus- 
infected cells and cancer cells by natural killer cells and cytotoxic T 
lymphocytes; an alternative mechanism involves perforin and 
granzymes 1,14-16. Peripheral blood natural killer cells triggered
marked cell death in Jurkat T leukaemia cells (Fig. 3d); DcR3-Fc
and Fas-Fc each reduced killing of target cells from ~65% to
~30%, with half-maximal inhibition at ~1 μg ml⁻¹; the residual
killing was probably mediated by the perforin/granzyme pathway. 
Thus, DcR3 binding blocks FasL-dependent natural killer cell 
activity. Higher DcR3-Fc and Fas-Fc concentrations were required 
to block natural killer cell activity compared with those required to 
block soluble FasL activity, which is consistent with the greater 
potency of membrane-associated FasL compared with soluble 
FasL 17.

Given the role of immune-cytotoxic cells in elimination of 
tumour cells and the fact that DcR3 can act as an inhibitor of 
FasL, we proposed that DcR3 expression might contribute to the 
ability of some tumours to escape immune-cytotoxic attack. As 
genomic amplification frequently contributes to tumorigenesis, we 
investigated whether the DcR3 gene is amplified in cancer. We 
analysed DcR3 gene-copy number by quantitative polymerase chain 
Figure 1 Primary structure and expression of human DcR3. a, Alignment of the
amino-acid sequences of DcR3 and of osteoprotegerin (OPG); the C-terminal 101
residues of OPG are not shown. The putative signal cleavage site (arrow), the
cysteine-rich domains (CRD 1-4), and the N-linked glycosylation site (asterisk) are
shown. b, Expression of DcR3 mRNA. Northern hybridization analysis was done
using the DcR3 cDNA as a probe and blots of poly(A)+ RNA (Clontech) from
human fetal and adult tissues or cancer cell lines. PBL, peripheral blood
lymphocyte.
Figure 2 Interaction of DcR3 with FasL. a, 293 cells were transfected with pRK5
vector (top) or with pRK5 encoding full-length FasL (bottom), incubated with
DcR3-Fc (solid line, shaded area), TNFR1-Fc (dotted line) or buffer control
(dashed line) (the dashed and dotted lines overlap), and analysed for binding by
FACS. Statistical analysis showed a significant difference (P < 0.001) between the
binding of DcR3-Fc to cells transfected with FasL or pRK5. PE, phycoerythrin-labelled
cells. b, 293 cells were transfected as in a and metabolically labelled, and
cell supernatants were immunoprecipitated with Fc-tagged TNFR1, DcR3 or Fas.
c, Purified soluble FasL (sFasL) was immunoprecipitated with TNFR1-Fc, DcR3-Fc
or Fas-Fc and visualized by immunoblot with anti-FasL antibody. sFasL was
loaded directly for comparison in the right-hand lane. d, Flag-tagged sFasL was
incubated with DcR3-Fc or with buffer and resolved by gel filtration; column
fractions were analysed in an assay that detects complexes containing DcR3-Fc
and sFasL-Flag. e, Equilibrium binding of DcR3-Fc or Fas-Fc to sFasL-Flag.
Inset, competition of DcR3-Fc with Fas-Fc for binding to sFasL-Flag.
reaction (PCR) 18 in genomic DNA from 35 primary lung and colon
tumours, relative to pooled genomic DNA from peripheral blood 
leukocytes (PBLs) of 10 healthy donors. Eight of 18 lung tumours 
and 9 of 17 colon tumours showed DcR3 gene amplification, 
ranging from 2- to 18-fold (Fig. 4a, b). To confirm this result, we 
analysed the colon tumour DNAs with three more, independent sets 
of DcR3-based PCR primers and probes; we observed nearly the 
same amplification (data not shown). 

We then analysed DcR3 mRNA expression in primary tumour 
tissue sections by in situ hybridization. We detected DcR3 expres- 
sion in 6 out of 15 lung tumours, 2 out of 2 colon tumours, 2 out of 5 
breast tumours, and 1 out of 1 gastric tumour (data not shown). A 
section through a squamous-cell carcinoma of the lung is shown in 
Fig. 4c. DcR3 mRNA was localized to infiltrating malignant epithe- 
lium, but was essentially absent from adjacent stroma, indicating 
tumour-specific expression. Although the individual tumour speci- 
mens that we analysed for mRNA expression and gene amplification 
were different, the in situ hybridization results are consistent with 
the finding that the DcR3 gene is amplified frequently in tumours. 
SW480 colon carcinoma cells, which showed abundant DcR3 
mRNA expression (Fig. 1b), also had marked DcR3 gene amplification,
as shown by quantitative PCR (fourfold) and by Southern blot
hybridization (fivefold) (data not shown). 

If DcR3 amplification in cancer is functionally relevant, then 
DcR3 should be amplified more than neighbouring genomic 
regions that are not important for tumour survival. To test this, 







Figure 3 Inhibition of FasL activity by DcR3. a, Human Jurkat T leukaemia cells
were incubated with Flag-tagged soluble FasL (sFasL; 5 ng ml⁻¹) oligomerized
with anti-Flag antibody (0.1 μg ml⁻¹) in the presence of the proposed inhibitors
DcR3-Fc, Fas-Fc or human IgG1 and assayed for apoptosis (mean ± s.e.m. of
triplicates). b, Jurkat cells were incubated with sFasL-Flag plus anti-Flag antibody
as in a, in the presence of 1 μg ml⁻¹ DcR3-Fc (filled circles), Fas-Fc (open circles) or
human IgG1 (triangles), and apoptosis was determined at the indicated time
points. c, Peripheral blood T cells were stimulated with PHA and interleukin-2,
followed by control (white bars) or anti-CD3 antibody (filled bars), together with
phosphate-buffered saline (PBS), human IgG1, Fas-Fc, or DcR3-Fc (10 μg ml⁻¹).
After 16 h, apoptosis of CD4+ cells was determined (mean ± s.e.m. of results from
five donors). d, Peripheral blood natural killer cells were incubated with ⁵¹Cr-labelled
Jurkat cells in the presence of DcR3-Fc (filled circles), Fas-Fc (open
circles) or human IgG1 (triangles), and target-cell death was determined by
release of ⁵¹Cr (mean ± s.d. for two donors, each in triplicate).




letters to nature 

we mapped the human DcR3 gene by radiation-hybrid analysis; 
DcR3 showed linkage to marker AFM218xe7 (T160), which maps to
chromosome position 20q13. Next, we isolated from a bacterial
artificial chromosome (BAC) library a human genomic clone that 
carries DcR3, and sequenced the ends of the clone's insert. We then 
determined, from the nine colon tumours that showed twofold or 
greater amplification of DcR3, the copy number of the DcR3-flanking
sequences (reverse and forward) from the BAC, and of
seven genomic markers that span chromosome 20 (Fig. 4d). The 
DcR3-linked reverse marker showed an average amplification of
roughly threefold, slightly less than the approximately fourfold 
amplification of DcR3; the other markers showed little or no 
amplification. These data indicate that DcR3 may be at the 'epi- 
centre' of a distal chromosome 20 region that is amplified in colon 
cancer, consistent with the possibility that DcR3 amplification 
promotes tumour survival. 

Our results show that DcR3 binds specifically to FasL and inhibits 
FasL activity. We did not detect DcR3 binding to several other TNF- 
ligand-family members; however, this does not rule out the possi- 
bility that DcR3 interacts with other ligands, as do some other 
TNFR family members, including OPG 2,19.

FasL is important in regulating the immune response; however, 
little is known about how FasL function is controlled. One mechan- 
ism involves the molecule cFLIP, which modulates apoptosis signal- 
ling downstream of Fas 20 . A second mechanism involves proteolytic 
shedding of FasL from the cell surface 17. DcR3 competes with Fas for




Figure 4 Genomic amplification of DcR3 in tumours. a, Lung cancers, comprising
eight adenocarcinomas (c, d, f, g, h, j, k, r), seven squamous-cell carcinomas (a, e,
m, n, o, p, q), one non-small-cell carcinoma (b), one small-cell carcinoma (i), and
one bronchial adenocarcinoma (l). The data are means ± s.d. of 2 experiments
done in duplicate. b, Colon tumours, comprising 17 adenocarcinomas. Data are
means ± s.e.m. of five experiments done in duplicate. c, In situ hybridization
analysis of DcR3 mRNA expression in a squamous-cell carcinoma of the lung. A
representative bright-field image (left) and the corresponding dark-field image
(right) show DcR3 mRNA over infiltrating malignant epithelium (arrowheads).
Adjacent non-malignant stroma (S), blood vessel (V) and necrotic tumour tissue
(N) are also shown. d, Average amplification of DcR3 compared with amplification
of neighbouring genomic regions (reverse and forward, Rev and Fwd), the
DcR3-linked marker T160, and other chromosome-20 markers, in the nine colon
tumours showing DcR3 amplification of twofold or more (b). Data are from two
experiments done in duplicate. Asterisk indicates P < 0.01 for a Student's t-test
comparing each marker with DcR3.
FasL binding; hence, it may represent a third mechanism of
extracellular regulation of FasL activity. A decoy receptor that
modulates the function of the cytokine interleukin-1 has been
described 21. In addition, two decoy receptors that belong to the
TNFR family, DcR1 and DcR2, regulate the FasL-related apoptosis-inducing
molecule Apo2L 22. Unlike DcR1 and DcR2, which are
membrane-associated proteins, DcR3 is directly secreted into the
extracellular space. One other secreted TNFR-family member is
OPG 3, which shares greater sequence homology with DcR3 (31%)
than do DcR1 (17%) or DcR2 (19%); OPG functions as a third
decoy for Apo2L 19. Thus, DcR3 and OPG define a new subset of
TNFR-family members that function as secreted decoys to modulate
ligands that induce apoptosis. Pox viruses produce soluble
TNFR homologues that neutralize specific TNF-family ligands,
thereby modulating the antiviral immune response 2. Our results
indicate that a similar mechanism, namely, production of a soluble
decoy receptor for FasL, may contribute to immune evasion by
certain tumours.



Methods 

Isolation of DcR3 cDNA. Several overlapping ESTs in GenBank (accession 
numbers AA025672, AA025673 and W67560) and in Lifeseq™ (Incyte 
Pharmaceuticals; accession numbers 1339238, 1533571, 1533650, 1542861, 
1789372 and 2207027) showed similarity to members of the TNFR family. We 
screened human cDNA libraries by PCR with primers based on the region of 
EST consensus; fetal lung was positive for a product of the expected size. By 
hybridization to a PCR-generated probe based on the ESTs, one positive clone 
(DNA30942) was identified. When searching for potential alternatively spliced 
forms of DcR3 that might encode a transmembrane protein, we isolated 50 
more clones; the coding regions of these clones were identical in size to that of 
the initial clone (data not shown). 

Fc-fusion proteins (immunoadhesins). The entire DcR3 sequence, or the 
ectodomain of Fas or TNFR1, was fused to the hinge and Fc region of human 
IgG1, expressed in insect SF9 cells or in human 293 cells, and purified as
described.

Fluorescence-activated cell sorting (FACS) analysis. We transfected 293 
cells using calcium phosphate or Effectene (Qiagen) with pRK5 vector or pRK5 
encoding full-length human FasL 4 (2 μg), together with pRK5 encoding CrmA
(2 μg) to prevent cell death. After 16 h, the cells were incubated with
biotinylated DcR3-Fc or TNFR1-Fc and then with phycoerythrin-conjugated
streptavidin (GibcoBRL), and were assayed by FACS. The data were analysed by 
Kolmogorov-Smirnov statistical analysis. There was some detectable staining 
of vector-transfected cells by DcR3-Fc; as these cells express little FasL (data
not shown), it is possible that DcR3 recognized some other factor that is 
expressed constitutively on 293 cells. 

Immunoprecipitation. Human 293 cells were transfected as above, and 
metabolically labelled with [³⁵S]cysteine and [³⁵S]methionine (0.5 mCi;
Amersham). After 16 h of culture in the presence of z-VAD-fmk (10 μM),
the medium was immunoprecipitated with DcR3-Fc, Fas-Fc or TNFR1-Fc
(5 μg), followed by protein A-Sepharose (Repligen). The precipitates were
resolved by SDS-PAGE and visualized on a phosphorimager (Fuji BAS2000).
Alternatively, purified, Flag-tagged soluble FasL (1 μg) (Alexis) was incubated
with each Fc-fusion protein (1 μg), precipitated with protein A-Sepharose,
resolved by SDS-PAGE and visualized by immunoblotting with rabbit anti-FasL
antibody (Oncogene Research).

Analysis of complex formation. Flag-tagged soluble FasL (25 μg) was
incubated with buffer or with DcR3-Fc (40 μg) for 1.5 h at 24 °C. The reaction
was loaded onto a Superdex 200 HR 10/30 column (Pharmacia) and developed
with PBS; 0.6-ml fractions were collected. The presence of DcR3-Fc-FasL
complex in each fraction was analysed by placing 100 μl aliquots into microtitre
wells precoated with anti-human IgG (Boehringer) to capture DcR3-Fc, 
followed by detection with biotinylated anti-Flag antibody Bio M2 (Kodak) and 
streptavidin-horseradish peroxidase (Amersham). Calibration of the column 
indicated an apparent relative molecular mass of the complex of 420K (data not 
shown), which is consistent with a stoichiometry of two DcR3-Fc homodimers 
to two soluble FasL homotrimers. 

Equilibrium binding analysis. Microtitre wells were coated with anti-human
IgG and blocked with 2% BSA in PBS. DcR3-Fc or Fas-Fc was added, followed by
serially diluted Flag-tagged soluble FasL. Bound ligand was detected with anti- 
Flag antibody as above. In the competition assay, Fas-Fc was immobilized as 
above, and the wells were blocked with excess IgG1 before addition of Flag-tagged
soluble FasL plus DcR3-Fc.

T-cell AICD. CD3+ lymphocytes were isolated from peripheral blood of
individual donors using anti-CD3 magnetic beads (Miltenyi Biotech),
stimulated with phytohaemagglutinin (PHA; 2 μg ml⁻¹) for 24 h, and cultured
in the presence of interleukin-2 (100 U ml⁻¹) for 5 days. The cells were plated in
wells coated with anti-CD3 antibody (Pharmingen) and analysed for apoptosis
16 h later by FACS analysis of annexin-V-binding of CD4+ cells.

Natural killer cell activity. Natural killer cells were isolated from peripheral
blood of individual donors using anti-CD56 magnetic beads (Miltenyi
Biotech), and incubated for 16 h with ⁵¹Cr-loaded Jurkat cells at an effector-to-target
ratio of 1:1 in the presence of DcR3-Fc, Fas-Fc or human IgG1.
Target-cell death was determined by release of ⁵¹Cr in effector-target co-cultures
relative to release of ⁵¹Cr by detergent lysis of equal numbers of Jurkat
cells.

Gene-amplification analysis. Surgical specimens were provided by J. Kern 
(lung tumours) and P. Quirke (colon tumours). Genomic DNA was extracted 
(Qiagen) and the concentration was determined using Hoechst dye 33258 
intercalation fluorometry. Amplification was determined by quantitative PCR 18
using a TaqMan instrument (ABI). The method was validated by comparison of
PCR and Southern hybridization data for the Myc and HER-2 oncogenes (data 
not shown). Gene-specific primers and fluorogenic probes were designed on 
the basis of the sequence of DcR3 or of nearby regions identified on a BAC 
carrying the human DcR3 gene; alternatively, primers and probes were based 
on Stanford Human Genome Center marker AFM218xe7 (T160), which is 
linked to DcR3 (likelihood score = 5.4), SHGC-36268 (T159), the nearest 
available marker which maps to ~500 kilobases from T160, and five extra
markers that span chromosome 20. The DcR3-specific primer sequences were 
5'-CTTCTTCGCGCACGCTG-3' and 5'-ATCACGCCGGCACCAG-3' and the 
fluorogenic probe sequence was 5'-(FAM)-ACACGATGCGTGCTCCAAGCAGAAp-(TAMARA),
where FAM is 5'-fluorescein phosphoramidite. Relative
gene-copy numbers were derived using the formula 2^(ΔCT), where ΔCT is the
difference in amplification cycles required to detect DcR3 in peripheral blood 
lymphocyte DNA compared to test DNA. 
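
The relative gene-copy calculation described above can be sketched in a few lines. This is a minimal illustration, assuming ΔCT is taken as the threshold cycle for the reference (peripheral blood lymphocyte) DNA minus the threshold cycle for the test DNA, and assuming perfect doubling per cycle; the function and argument names are hypothetical and do not come from the paper.

```python
def relative_copy_number(ct_reference: float, ct_test: float) -> float:
    """Relative gene-copy number as 2 ** delta_ct.

    Assumptions (not stated in this form in the text):
      - delta_ct = ct_reference - ct_test, so a test sample that is
        detected earlier (lower threshold cycle) gives a value above 1;
      - amplification efficiency is exactly twofold per cycle.
    """
    delta_ct = ct_reference - ct_test
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Hypothetical threshold cycles, for illustration only.
    print(relative_copy_number(ct_reference=25.0, ct_test=23.0))  # two cycles earlier
    print(relative_copy_number(ct_reference=25.0, ct_test=25.0))  # no difference
```

With these hypothetical inputs, a test sample detected two cycles earlier than the reference corresponds to a fourfold relative copy number, and equal threshold cycles correspond to a value of 1.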
ABC transporters (also known as traffic ATPases) form a large
family of proteins responsible for the translocation of a variety 
of compounds across membranes of both prokaryotes and 
eukaryotes 1 . The recently completed Escherichia coli genome 
sequence revealed that the largest family of paralogous E. coli
proteins is composed of ABC transporters 2 . Many eukaryotic 
proteins of medical significance belong to this family, such as 
the cystic fibrosis transmembrane conductance regulator (CFTR), 
the P-glycoprotein (or multidrug-resistance protein) and the 
heterodimeric transporter associated with antigen processing 
(Tap1-Tap2). Here we report the crystal structure at 1.5 Å resolution
of HisP, the ATP-binding subunit of the histidine permease,
which is an ABC transporter from Salmonella typhimurium. We 
correlate the details of this structure with the biochemical, genetic 
and biophysical properties of the wild-type and several mutant 
HisP proteins. The structure provides a basis for understanding 
properties of ABC transporters and of defective CFTR proteins. 

ABC transporters contain four structural domains: two nucleotide-binding
domains (NBDs), which are highly conserved
throughout the family, and two transmembrane domains'. In 
prokaryotes these domains are often separate subunits which are 
assembled into a membrane-bound complex; in eukaryotes the 
domains are generally fused into a single polypeptide chain. The 
periplasmic histidine permease of S. typhimurium and E. coli is a
well-characterized ABC transporter that is a good model for this
superfamily. It consists of a membrane-bound complex, HisQMP2,
which comprises integral membrane subunits, HisQ and HisM, and
two copies of HisP, the ATP-binding subunit. HisP, which has
properties intermediate between those of integral and peripheral 
membrane proteins', is accessible from both sides of the membrane, 
presumably by its interaction with HisQ and HisM*. The two HisP 
subunits form a dimer, as shown by their cooperativity in ATP 
hydrolysis 5 , the requirement for both subunits to be present for 
activity 8 , and the formation of a HisP dimer upon chemical cross- 
linking. Soluble HisP also forms a dimer 3 . HisP has been purified 
and characterized in an active soluble form 3 which can be recon- 
stituted into a fully active membrane-bound complex 8 . 

The overall shape of the crystal structure of the HisP monomer is
that of an 'L' with two thick arms (arm I and arm II); the ATP-
binding pocket is near the end of arm I (Fig. 1). A six-stranded β-
sheet (β3 and β8-β12) spans both arms of the L, with a domain of an
α- plus β-type structure (β1, β2, β4-β7, α1 and α2) on one side
(within arm I) and a domain of mostly α-helices (α3-α9) on the

Figure 1 Crystal structure of HisP. a, View of the dimer along an axis
perpendicular to its two-fold axis. The top and bottom of the dimer are suggested
to face towards the periplasmic and cytoplasmic sides, respectively (see text).
The thickness of arm II is about 25 Å, comparable to that of membrane. α-Helices
are shown in orange and β-sheets in green. b, View along the two-fold axis of the
HisP dimer, showing the relative displacement of the monomers not apparent in
a. The β-strands at the dimer interface are labelled. c, View of one monomer from
the bottom of arm I, as shown in a, towards arm II, showing the ATP-binding
pocket. a-c, The protein and the bound ATP are in 'ribbon' and 'ball-and-stick'
representations, respectively. Key residues discussed in the text are indicated in
c. These figures were prepared with MOLSCRIPT. N, amino terminus; C, C
terminus.

Gene amplification is a common event in the progression of
human cancers, and amplified oncogenes have been shown to 
have diagnostic, prognostic and therapeutic relevance. A 
kinetic quantitative polymerase-chain-reaction (PCR) method, 
based on fluorescent TaqMan methodology and a new instru- 
ment (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real-time, was used to quantify 
gene amplification in tumor DNA. Reactions are character- 
ized by the point during cycling when PCR amplification is still 
in the exponential phase, rather than the amount of PCR 
product accumulated after a fixed number of cycles. None of 
the reaction components is limited during the exponential 
phase, meaning that values are highly reproducible in reac- 
tions starting with the same copy number. This greatly 
improves the precision of DNA quantification. Moreover,
real-time PCR does not require post-PCR sample handling, 
thereby preventing potential PCR-product carry-over con- 
tamination; it possesses a wide dynamic range of quantifica- 
tion and results in much faster and higher sample throughput. 
The real-time PCR method was used to develop and validate
a simple and rapid assay for the detection and quantification
of the 3 most frequently amplified genes (myc, ccnd1 and
erbB2) in breast tumors. Extra copies of myc, ccnd1 and erbB2
were observed in 10, 23 and 15%, respectively, of 108 breast-
tumor DNA samples; the largest observed numbers of gene copies
were 4.6, 18.6 and 15.1, respectively. These results correlated 
well with those of Southern blotting. The use of this new 
semi-automated technique will make molecular analysis of 
human cancers simpler and more reliable, and should find 
broad applications in clinical and research settings. Int. J. 
Cancer 78:661-666, 1998. 
© 1998 Wiley-Liss, Inc.

Gene amplification plays an important role in the pathogenesis 
of various solid tumors, including breast cancer, probably because 
over-expression of the amplified target genes confers a selective 
advantage. The first technique used to detect genomic amplification 
was cytogenetic analysis. Amplification of several chromosome
regions, visualized either as extrachromosomal double minutes
(dmins) or as integrated homogeneously staining regions (HSRs),
is among the main visible cytogenetic abnormalities in breast
tumors. Other techniques such as comparative genomic hybridiza-
tion (CGH) (Kallioniemi et al., 1994) have also been used in broad
searches for regions of increased DNA copy numbers in tumor
cells, and have revealed some 20 amplified chromosome regions in
breast tumors. Positional cloning efforts are underway to identify
the critical gene(s) in each amplified region. To date, genes known
to be amplified frequently in breast cancers include myc (8q24),
ccnd1 (11q13), and erbB2 (17q12-q21) (for review, see Bieche and
Lidereau, 1995).

Amplification of the myc, ccnd1, and erbB2 proto-oncogenes
should have clinical relevance in breast cancer, since independent
studies have shown that these alterations can be used to identify
sub-populations with a worse prognosis (Berns et al., 1992;
Schuuring et al., 1992; Slamon et al., 1987). Muss et al. (1994)
suggested that these gene alterations may also be useful for the 
prediction and assessment of the efficacy of adjuvant chemotherapy 
and hormone therapy. 

However, published results diverge both in terms of the fre- 
quency of these alterations and their clinical value. For instance, 
over 500 studies in 10 years have failed to resolve the controversy
surrounding the link suggested by Slamon et al. (1987) between
erbB2 amplification and disease progression. These discrepancies 
are partly due to the clinical, histological and ethnic heterogeneity 
of breast cancer, but technical considerations are also probably 
involved. 

Specific genes (DNA) were initially quantified in tumor cells by
means of blotting procedures such as Southern and slot blotting.
These batch techniques require large amounts of DNA (5-10
µg/reaction) to yield reliable quantitative results. Furthermore,
meticulous care is required at all stages of the procedures to 
generate blots of sufficient quality for reliable dosage analysis. 
Recently, PCR has proven to be a powerful tool for quantitative 
DNA analysis, especially with minimal starting quantities of tumor 
samples (small, early-stage tumors and formalin-fixed, paraffin- 
embedded tissues). 

Quantitative PCR can be performed by evaluating the amount of 
product either after a given number of cycles (end-point quantita- 
tive PCR) or after a varying number of cycles during the 
exponential phase (kinetic quantitative PCR). In the first case, an 
internal standard distinct from the target molecule is required to 
ascertain PCR efficiency. The method is relatively easy but implies 
generating, quantifying and storing an internal standard for each 
gene studied. Nevertheless, it is the most frequently applied 
method to date. 

One of the major advantages of the kinetic method is its rapidity 
in quantifying a new gene, since no internal standard is required (an 
external standard curve is sufficient). Moreover, the kinetic method 
has a wide dynamic range (at least 5 orders of magnitude), giving 
an accurate value for samples differing in their copy number. 
Unfortunately, the method is cumbersome and has therefore been 
rarely used. It involves aliquot sampling of each assay mix at 
regular intervals and quantifying, for each aliquot, the amplifica- 
tion product. Interest in the kinetic method has been stimulated by a 
novel approach using fluorescent TaqMan methodology and a new 
instrument (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real time (Gibson et al., 1996; Heid et
al., 1996). The TaqMan reaction is based on the 5' nuclease assay
first described by Holland et al. (1991). The latter uses the 5'
nuclease activity of Taq polymerase to cleave a specific fluorogenic
oligonucleotide probe during the extension phase of PCR. The
approach uses dual-labeled fluorogenic hybridization probes (Lee
et al., 1993). One fluorescent dye, covalently linked to the 5' end
of the oligonucleotide, serves as a reporter [FAM (i.e., 6-carboxy- 
fluorescein)] and its emission spectrum is quenched by a second 
fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-rhodamine) 
attached to the 3' end. During the extension phase of the PCR 
cycle, the fluorescent hybridization probe is hydrolyzed by the
5'-3' nucleolytic activity of DNA polymerase. Nuclease degrada- 
tion of the probe releases the quenching of FAM fluorescence 
emission, resulting in an increase in peak fluorescence emission. 
The fluorescence signal is normalized by dividing the emission 
intensity of the reporter dye (FAM) by the emission intensity of a 
reference dye (i.e., ROX, 6-carboxy-X-rhodamine) included in 
TaqMan buffer, to obtain a ratio defined as the Rn (normalized 
reporter) for a given reaction tube. The use of a sequence detector 
enables the fluorescence spectra of all 96 wells of the thermal 
cycler to be measured continuously during PCR amplification. 
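
The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name and the intensity values are hypothetical and are not taken from the study or from the instrument software.

```python
def normalized_reporter(fam_emission, rox_emission):
    """Return Rn: reporter (FAM) emission divided by reference (ROX) emission."""
    if rox_emission <= 0:
        raise ValueError("reference dye emission must be positive")
    return fam_emission / rox_emission


# Hypothetical raw intensities for one reaction tube at four successive cycles.
fam = [120.0, 125.0, 240.0, 480.0]
rox = [100.0, 100.0, 100.0, 100.0]

# One Rn value per cycle; Rn rises as the probe is cleaved and FAM is unquenched.
rn_per_cycle = [normalized_reporter(f, r) for f, r in zip(fam, rox)]
```

Dividing by the reference dye is what lets tubes with slightly different volumes or optical paths be compared on one scale.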

The real-time PCR method offers several advantages over other
current quantitative PCR methods (Celi et al., 1994): (i) the
probe-based homogeneous assay provides a real-time method for
detecting only specific amplification products, since specific
hybridization of both the primers and the probe is necessary to generate a
signal; (ii) the Ct (threshold cycle) value used for quantification is
measured when PCR amplification is still in the log phase of PCR
product accumulation. This is the main reason why Ct is a more
reliable measure of the starting copy number than are end-point
measurements, in which a slight difference in a limiting component
can have a drastic effect on the amount of product; (iii) use of Ct
values gives a wider dynamic range (at least 5 orders of magni-
tude), reducing the need for serial dilution; (iv) the real-time PCR
method is run in a closed-tube system and requires no post-PCR
sample handling, thus avoiding potential contamination; (v) the
system is highly automated, since the instrument continuously
measures fluorescence in all 96 wells of the thermal cycler during
PCR amplification and the corresponding software processes and
analyzes the fluorescence data; (vi) the assay is rapid, as results are
available just one minute after thermal cycling is complete; (vii) the
sample throughput of the method is high, since 96 reactions can be
analyzed in 2 hr.

Here, we applied this semi-automated procedure to determine
the copy numbers of the 3 most frequently amplified genes in breast
tumors (myc, ccnd1 and erbB2), as well as 2 genes (alb and app)
located in a chromosome region in which no genetic changes have
been observed in breast tumors. The results for 108 breast tumors
were compared with previous Southern-blot data for the same
samples.



MATERIAL AND METHODS 
Tumor and blood samples 

Samples were obtained from 108 primary breast tumors removed
surgically from patients at the Centre Rene Huguenin; none of the
patients had undergone radiotherapy or chemotherapy. Immedi-
ately after surgery, the tumor samples were placed in liquid
nitrogen until extraction of high-molecular-weight DNA. Patients
were included in this study if the tumor sample used for DNA
preparation contained more than 60% of tumor cells (histological
analysis). A blood sample was also taken from 18 of the same
patients.

DNA was extracted from tumor tissue and blood leukocytes 
according to standard methods. 

Real-time PCR

Theoretical basis. Reactions are characterized by the point
during cycling when amplification of the PCR product is first
detected, rather than by the amount of PCR product accumulated
after a fixed number of cycles. The higher the starting copy number
of the genomic DNA target, the earlier a significant increase in
fluorescence is observed. The parameter Ct (threshold cycle) is
defined as the fractional cycle number at which the fluorescence
generated by cleavage of the probe passes a fixed threshold above
baseline. The target gene copy number in unknown samples is
quantified by measuring Ct and by using a standard curve to
determine the starting copy number. The precise amount of
genomic DNA (based on optical density) and its quality (i.e., lack
of extensive degradation) are both difficult to assess. We therefore
also quantified a control gene (alb) mapping to chromosome region
4q11-q13, in which no genetic alterations have been found in
breast-tumor DNA by means of CGH (Kallioniemi et al., 1994).
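
The threshold-cycle definition given above (a fractional cycle number at which the signal passes a fixed threshold) can be illustrated with a short Python sketch. The text does not describe how the instrument software interpolates between cycles, so the linear interpolation used here is an assumption, and the function name and signal values are hypothetical.

```python
def threshold_cycle(delta_rn, threshold):
    """Return the fractional cycle at which delta_rn first crosses threshold.

    delta_rn[i] is the baseline-subtracted signal measured after cycle i + 1.
    Returns None if no upward crossing of the threshold is found.
    """
    for i in range(1, len(delta_rn)):
        prev, curr = delta_rn[i - 1], delta_rn[i]
        if prev < threshold <= curr:
            # Assumed: straight-line interpolation between cycle i and cycle i + 1.
            return i + (threshold - prev) / (curr - prev)
    return None


# Hypothetical baseline-subtracted signal for six cycles.
signal = [0.0, 0.0, 0.1, 0.3, 0.7, 1.5]
ct = threshold_cycle(signal, threshold=0.5)  # crossing lies between cycles 4 and 5
```

A sample with more starting template crosses the threshold earlier and therefore has a smaller Ct.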

Thus, the ratio of the copy number of the target gene to the copy 
number of the alb gene normalizes the amount and quality of 
genomic DNA. The ratio defining the level of amplification is 
termed "N", and is determined as follows: 

\[
N = \frac{\text{copy number of target gene (app, myc, ccnd1, erbB2)}}{\text{copy number of reference gene (alb)}}
\]
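
As a worked example of this ratio, the following Python sketch recomputes N from the mean ccnd1 and alb copy numbers reported for three tumors in Table II. The variable names are illustrative; the numbers are those given in the table.

```python
# Mean copy numbers (three replicates each) as reported in Table II.
mean_copy_numbers = {
    "T118": {"ccnd1": 4603, "alb": 4325},
    "T133": {"ccnd1": 61100, "alb": 10137},
    "T145": {"ccnd1": 125392, "alb": 7672},
}

# N = target gene copy number / reference gene (alb) copy number.
n_values = {
    tumor: round(means["ccnd1"] / means["alb"], 2)
    for tumor, means in mean_copy_numbers.items()
}
```

The results (1.06, 6.03 and 16.34) reproduce the N values listed in Table II for the non-amplified, 6-fold and 16-fold cases.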

Primers, probes, reference human genomic DNA and PCR 
consumables. Primers and probes were chosen with the assistance 
of the computer programs Oligo 4.0 (National Biosciences, Ply- 
mouth, MN), EuGene (Daniben Systems, Cincinnati, OH) and Primer 
Express (Perkin-Elmer Applied Biosystems, Foster City, CA). 

Primers were purchased from DNAgency (Malvern, PA) and 
probes from Perkin-Elmer Applied Biosystems. 

Nucleotide sequences for the oligonucleotide hybridization 
probes and primers are available on request. 

The TaqMan PCR Core reagent kit, MicroAmp optical tubes, 
and MicroAmp caps were from Perkin-Elmer Applied Biosystems. 

Standard-curve construction. The kinetic method requires a
standard curve. The latter was constructed with serial dilutions of
specific PCR products, according to Piatak et al. (1993). In
practice, each specific PCR product was obtained by amplifying 20
ng of a standard human genomic DNA (Boehringer, Mannheim,
Germany) with the same primer pairs as those used later for
real-time quantitative PCR. The 5 PCR products were purified
using MicroSpin S-400 HR columns (Pharmacia, Uppsala, Swe-
den), electrophoresed through an acrylamide gel and stained with
ethidium bromide to check their quality. The PCR products were
then quantified spectrophotometrically and pooled, and serially
diluted 10-fold in mouse genomic DNA (Clontech, Palo Alto, CA)
at a constant concentration of 2 ng/µl. The standard curve used for
real-time quantitative PCR was based on serial dilutions of the pool
of PCR products ranging from 10^-7 (10^5 copies of each gene) to
10^-10 (10^2 copies). This series of diluted PCR products was
aliquoted and stored at -80°C until use.

The standard curve was validated by analyzing 2 known
quantities of calibrator human genomic DNA (20 ng and 50 ng).
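
How such a standard curve converts a measured Ct into a starting copy number can be sketched as follows. This is a minimal illustration that assumes Ct is linear in log10 of the starting copy number; the Ct values for the four dilutions are invented for the example, and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = intercept + slope * log10(copies)."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Dilution series from 10^5 to 10^2 copies with hypothetical Ct values.
standard_copies = [1e5, 1e4, 1e3, 1e2]
standard_cts = [20.0, 23.3, 26.6, 29.9]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)
unknown_copies = copies_from_ct(26.6, slope, intercept)
```

With these made-up points the fitted slope is -3.3 cycles per 10-fold dilution, close to the value of about -3.32 expected when the product doubles every cycle.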

PCR amplification. Amplification mixes (50 µl) contained the
sample DNA (around 20 ng, around 6600 copies of disomic genes),
10× TaqMan buffer (5 µl), 200 µM dATP, dCTP, dGTP, and 400
µM dUTP, 5 mM MgCl2, 1.25 units of AmpliTaq Gold, 0.5 units of
AmpErase uracil N-glycosylase (UNG), 200 nM each primer and
100 nM probe. Thermal cycling began with 2 min at
50°C and 10 min at 95°C, followed by 40 cycles at
95°C for 15 s and 65°C for 1 min. Each assay included: a standard
curve (from 10^5 to 10^2 copies) in duplicate, a no-template control,
20 ng and 50 ng of calibrator human genomic DNA (Boehringer) in
triplicate, and about 20 ng of unknown genomic DNA in triplicate
(26 samples can thus be analyzed on a 96-well microplate). All
samples with a coefficient of variation (CV) higher than 10% were
retested.
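
The retest rule can be expressed as a small check on each set of triplicates. The text does not say which quantity the CV was computed on or which standard-deviation formula was used, so this sketch assumes the sample standard deviation of replicate copy numbers; the function names and example values are hypothetical.

```python
import statistics


def coefficient_of_variation(replicates):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def needs_retest(replicates, max_cv_percent=10.0):
    """True when the replicate CV exceeds the 10% limit stated in the text."""
    return coefficient_of_variation(replicates) > max_cv_percent


tight_triplicate = [100.0, 102.0, 98.0]   # hypothetical, CV of 2%
loose_triplicate = [100.0, 130.0, 80.0]   # hypothetical, CV of about 24%
```

Here `needs_retest(tight_triplicate)` is false and `needs_retest(loose_triplicate)` is true.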

All reactions were performed in the ABI Prism 7700 Sequence 
Detection System (Perkin-Elmer Applied Biosystems), which 
detects the signal from the fluorogenic probe during PCR. 

Equipment for real-time detection. The 7700 system has a
built-in thermal cycler and a laser directed via fiber optical cables
to each of the 96 sample wells. A charge-coupled-device (CCD)
camera collects the emission from each sample and the data are
analyzed automatically. The software accompanying the 7700
system calculates Ct and determines the starting copy number in the
samples.

Determination of gene amplification. Gene amplification was
calculated as described above. Only samples with an N value 
higher than 2 were considered to be amplified. 



RESULTS 

To validate the method, real-time PCR was performed on
genomic DNA extracted from 108 primary breast tumors, and 18
normal leukocyte DNA samples from some of the same patients.
The target genes were the myc, ccnd1 and erbB2 proto-oncogenes,
and the β-amyloid precursor protein gene (app), which maps to a
chromosome region (21q21.2) in which no genetic alterations have
been found in breast tumors (Kallioniemi et al., 1994). The
reference disomic gene was the albumin gene (alb, chromosome
4q11-q13).



Validation of the standard curve and dynamic range 
of real-time PCR 

The standard curve was constructed from PCR products serially
diluted in genomic mouse DNA at a constant concentration of
2 ng/µl. It should be noted that the 5 primer pairs chosen to analyze
the 5 target genes do not amplify genomic mouse DNA (data not
shown). Figure 1 shows the real-time PCR standard curve for the
alb gene. The dynamic range was wide (at least 4 orders of
magnitude), with samples containing as few as 10^3 copies or as
many as 10^5 copies.

Copy-number ratio of the 2 reference genes (app and alb) 

The app to alb copy-number ratio was determined in 18 normal
leukocyte DNA samples and all 108 primary breast-tumor DNA 

Figure 1 - Albumin (alb) gene dosage by real-time PCR. Top: Amplification plots for reactions with starting alb gene copy number ranging
from 10^5 (A9), 10^4 (A7), 10^3 (A4) to 10^2 (A2) and a no-template control (A1). Cycle number is plotted vs. change in normalized reporter signal
(ΔRn). For each reaction tube, the fluorescence signal of the reporter dye (FAM) is divided by the fluorescence signal of the passive reference dye
(ROX), to obtain a ratio defined as the normalized reporter signal (Rn). ΔRn represents the normalized reporter signal (Rn) minus the baseline
signal established in the first 15 PCR cycles. ΔRn increases during PCR as alb PCR product copy number increases until the reaction reaches a
plateau. Ct (threshold cycle) represents the fractional cycle number at which a significant increase in Rn above a baseline signal (horizontal black
line) can first be detected. Two replicate plots were performed for each standard sample, but the data for only one are shown here. Bottom:
Standard curve plotting log starting copy number vs. Ct (threshold cycle). The black dots represent the data for standard samples plotted in
duplicate and the red dots the data for unknown genomic DNA samples plotted in triplicate. The standard curve shows 4 orders of linear dynamic
range.

samples. We selected these 2 genes because they are located in 2
chromosome regions (app, 21q21.2; alb, 4q11-q13) in which no
obvious genetic changes (including gains or losses) have been
observed in breast cancers (Kallioniemi et al., 1994). The ratio for
the 18 normal leukocyte DNA samples fell between 0.7 and 1.3
(mean 1.02 ± 0.21), and was similar for the 108 primary breast-
tumor DNA samples (0.6 to 1.6, mean 1.06 ± 0.25), confirming
that alb and app are appropriate reference disomic genes for
breast-tumor DNA. The low range of the ratios also confirmed that
the nucleotide sequences chosen for the primers and probes were
not polymorphic, as mismatches of their primers or probes with the
subject's DNA would have resulted in differential amplification.

myc, ccnd1 and erbB2 gene dose in normal leukocyte DNA

To determine the cut-off point for gene amplification in breast-
cancer tissue, 18 normal leukocyte DNA samples were tested for
the gene dose (N), calculated as described in "Material and
Methods". The N value of these samples ranged from 0.5 to 1.3
(mean 0.84 ± 0.22) for myc, 0.7 to 1.6 (mean 1.06 ± 0.23) for
ccnd1 and 0.6 to 1.3 (mean 0.91 ± 0.19) for erbB2. Since N values
for myc, ccnd1 and erbB2 in normal leukocyte DNA consistently
fell between 0.5 and 1.6, values of 2 or more were considered to
represent gene amplification in tumor DNA.
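
The cut-off can be written as a simple classification of N. The bin edges below follow the amplification-level bins used in the paper's table of results (below 0.5, 0.5 to 1.9, 2 to 4.9, and 5 or more); the category labels and the function name are hypothetical wording chosen for this sketch, not terms defined by the authors.

```python
def classify_gene_dose(n):
    """Assign a gene-dose ratio N to one of four amplification-level bins."""
    if n < 0.5:
        return "decreased copy number"
    if n < 2.0:
        return "not amplified"
    if n < 5.0:
        return "low-level amplification"
    return "high-level amplification"


# N values within the normal-leukocyte range (0.5 to 1.6) are not amplified.
normal_range_calls = [classify_gene_dose(n) for n in (0.5, 1.06, 1.6)]
```

Values of exactly 2 fall in the amplified group here, following the "2 or more" wording of this paragraph.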

myc, ccnd1 and erbB2 gene dose in breast-tumor DNA

myc, ccnd1 and erbB2 gene copy numbers in the 108 primary
breast tumors are reported in Table I. Extra copies of ccnd1 were
more frequent (23%, 25/108) than extra copies of erbB2 (15%,
16/108) and myc (10%, 11/108), and ranged from 2 to 18.6 for
ccnd1, 1 to 15.1 for erbB2, and only 2 to 4.6 for the myc gene.
Figure 2 and Table II represent tumors in which the ccnd1 gene was
amplified 16-fold (T145), 6-fold (T133) and non-amplified (T118).
The 3 genes were never found to be co-amplified in the same tumor.
erbB2 and ccnd1 were co-amplified in only 3 cases, myc and ccnd1
in 2 cases and myc and erbB2 in 1 case. This favors the hypothesis
that gene amplifications are independent events in breast cancer.
Interestingly, 5 tumors showed a decrease of at least 50% in the
erbB2 copy number (N < 0.5), suggesting that they bore deletions
of the 17q21 region (the site of erbB2). No such decrease in copy
number was observed with the other 2 proto-oncogenes.
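
The frequencies quoted above can be recomputed from the counts, and the independence argument can be given a rough numerical reading. The second part of this sketch is an illustrative back-of-envelope comparison, not an analysis reported by the authors: it assumes that, if amplifications were independent, the expected number of tumors carrying two given amplifications would be the product of the two counts divided by 108.

```python
from itertools import combinations

TOTAL_TUMORS = 108
amplified = {"ccnd1": 25, "erbB2": 16, "myc": 11}
observed_pairs = {("ccnd1", "erbB2"): 3, ("ccnd1", "myc"): 2, ("erbB2", "myc"): 1}

# Percentage of tumors with extra copies of each gene.
percent = {gene: round(100 * count / TOTAL_TUMORS) for gene, count in amplified.items()}

# Expected co-amplification counts if the events were independent (assumption).
expected_pairs = {
    (a, b): amplified[a] * amplified[b] / TOTAL_TUMORS
    for a, b in combinations(sorted(amplified), 2)
}
```

Under that assumption the expected pair counts are about 3.7, 2.5 and 1.6, against 3, 2 and 1 observed, which is at least not in conflict with independent events.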

Comparison of gene dose determined by real-time quantitative 
PCR and Southern-blot analysis 

Southern-blot analysis of myc, ccnd1 and erbB2 amplifications
had previously been done on the same 108 primary breast tumors. A
perfect correlation between the results of real-time PCR and
Southern blot was obtained for tumors with high copy numbers
(N ≥ 5). However, there were cases (1 myc, 6 ccnd1 and 4 erbB2)
in which real-time PCR showed gene amplification whereas
Southern blot did not, but these were mainly cases with low extra
copy numbers (N from 2 to 2.9).

DISCUSSION 

The clinical applications of gene amplification assays are 
currently limited, but would certainly increase if a simple, standard- 
ized and rapid method were perfected. Gene amplification status 
has been studied mainly by means of Southern blotting, but this 
method is not sensitive enough to detect low-level gene amplifica- 
tion nor accurate enough to quantify the full range of amplification 
values. Southern blotting is also time-consuming, uses radioactive 



TABLE I - DISTRIBUTION OF AMPLIFICATION LEVEL (N) FOR myc,
ccnd1 AND erbB2 GENES IN 108 HUMAN BREAST TUMORS

                     Amplification level (N)
Gene     <0.5        0.5-1.9       2-4.9         ≥5
myc      0           97 (89.8%)    11 (10.2%)    0
ccnd1    0           83 (76.9%)    17 (15.7%)    8 (7.4%)
erbB2    5 (4.6%)    87 (80.6%)    8 (7.4%)      8 (7.4%)


reagents and requires relatively large amounts of high-quality
genomic DNA, which means it cannot be used routinely in many
laboratories. An amplification step is therefore required to deter- 
mine the copy number of a given target gene from minimal 
quantities of tumor DNA (small early-stage tumors, cytopuncture 
specimens or formalin-fixed, paraffin-embedded tissues). 

In this study, we validated a PCR method developed for the
quantification of gene over-representation in tumors. The method,
based on real-time analysis of PCR amplification, has several
advantages over other PCR-based quantitative assays such as
competitive quantitative PCR (Celi et al., 1994). First, the real-time
PCR method is performed in a closed-tube system, avoiding the
risk of contamination by amplified products. Re-amplification of
carryover PCR products in subsequent experiments can also be
prevented by using the enzyme uracil N-glycosylase (UNG)
(Longo et al., 1990). The second advantage is the simplicity and
rapidity of sample analysis, since no post-PCR manipulations are
required. Our results show that the automated method is reliable.
We found it possible to determine, in triplicate, the number of
copies of a target gene in more than 100 tumors per day. Third, the
system has a linear dynamic range of at least 4 orders of magnitude,
meaning that samples do not have to contain equal starting amounts
of DNA. This technique should therefore be suitable for analyzing
formalin-fixed, paraffin-embedded tissues. Fourth, and above all,
real-time PCR makes DNA quantification much more precise and
reproducible, since it is based on Ct values rather than end-point
measurement of the amount of accumulated PCR product. Indeed,
the ABI Prism 7700 Sequence Detection System enables Ct to be
calculated when PCR amplification is still in the exponential phase
and when none of the reaction components is rate-limiting. The
within-run CV of the Ct value for calibrator human DNA (5
replicates) was always below 5%, and the between-assay precision
in 5 different runs was always below 10% (data not shown). In
addition, the use of a standard curve is not absolutely necessary,
since the copy number can be determined simply by comparing the
Ct ratio of the target gene with that of reference genes. The results
obtained by the 2 methods (with and without a standard curve) are
similar in our experiments (data not shown). Moreover, unlike
competitive quantitative PCR, real-time PCR does not require an
internal control (the design and storage of internal controls and the
validation of their amplification efficiency are laborious).
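
One common way to estimate gene dose without a standard curve is the comparative Ct calculation sketched below. The paper does not spell out its own formula, so this is an assumption-laden illustration: it supposes that both amplicons double exactly once per cycle and that a calibrator DNA with a normal gene dose (N = 1) is run alongside the sample. The function name and Ct values are hypothetical.

```python
def relative_copy_number(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Estimate N from Ct values alone, assuming perfect doubling per cycle.

    ct_target, ct_ref: target and reference gene Ct in the tumor sample.
    ct_target_cal, ct_ref_cal: the same two Ct values in calibrator DNA.
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = ct_target_cal - ct_ref_cal
    # Each cycle of earlier detection corresponds to a 2-fold higher dose.
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical example: the target crosses threshold 3 cycles earlier than the
# reference gene in the tumor, but at the same cycle in the calibrator.
n_estimate = relative_copy_number(22.0, 25.0, 25.0, 25.0)
```

In this example the estimate is 8, since three cycles of earlier detection correspond to 2^3 more starting copies.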

The only potential disadvantage of real-time PCR, like all other
PCR-based methods and solid-matrix blotting techniques (Southern
blots and dot blots), is that it cannot avoid dilution artifacts
inherent in the extraction of DNA from tumor cells contained in
heterogeneous tissue specimens. Only FISH and immunohistochem-
istry can measure alterations on a cell-by-cell basis (Pauletti et al.,
1996; Slamon et al., 1989). However, FISH requires expensive
equipment and trained personnel and is also time-consuming. 
Moreover, FISH does not assess gene expression and therefore 
cannot detect cases in which the gene product is over-expressed in 
the absence of gene amplification, which will be possible in the 
future by real-time quantitative RT-PCR. Immunohistochemistry is 
subject to considerable variations in the hands of different teams, 
owing to alterations of target proteins during the procedure, the 
different primary antibodies and fixation methods used and the 
criteria used to define positive staining. 

The results of this study are in agreement with those reported in
the literature. (i) Chromosome regions 4q11-q13 and 21q21.2
(which bear alb and app, respectively) showed no genetic alter-
ations in the breast-cancer samples studied here, in keeping with
the results of CGH (Kallioniemi et al., 1994). (ii) We found that
amplifications of these 3 oncogenes were independent events, as
reported by other teams (Berns et al., 1992; Borg et al., 1992). (iii)
The frequency and degree of myc amplification in our breast tumor
DNA series were lower than those of ccnd1 and erbB2 amplifica-
tion, confirming the findings of Borg et al. (1992) and Courjal et al.
(1997). (iv) The maxima of ccnd1 and erbB2 over-representation
were 18-fold and 15-fold, also in keeping with earlier results (about

Figure 2 - ccnd1 and alb gene dosage by real-time PCR in 3 breast tumor samples: T118 (E12, C6, black squares), T133 (G11, B4, red squares)
and T145 (A8, C8, blue squares). Given the Ct of each sample, the initial copy number is inferred from the standard curve obtained during the same
experiment. Triplicate plots were performed for each tumor sample, but the data for only one are shown here. The results are shown in Table II.



30-fold maximum) (Berns et al., 1992; Borg et al., 1992; Courjal et
al., 1997). (v) The erbB2 copy numbers obtained with real-time
PCR were in good agreement with data obtained with other
quantitative PCR-based assays in terms of the frequency and
degree of amplification (An et al., 1995; Deng et al., 1996; Valeron
et al., 1996). Our results also correlate well with those recently
published by Gelmini et al. (1997), who used the TaqMan system to
measure erbB2 amplification in a small series of breast tumors
(n = 25), but with an instrument (LS-50B luminescence spectrom-
eter, Perkin-Elmer Applied Biosystems) which only allows end-

TABLE II - EXAMPLES OF ccnd1 GENE DOSAGE RESULTS
FROM 3 BREAST TUMORS(1)

         ccnd1                         alb
Tumor    Copy number   Mean     SD     Copy number   Mean    SD    N ccnd1/alb
T118     4525                          4223
         4605          4603     77     4365          4325    89    1.06
         4678                          4387
T133     59821                         9787
         61659         61100    1111   10092         10137   375   6.03
         61821                         10533
T145     128563                        7321
         125892        125392   3448   7762          7672    316   16.34
         121722                        7933

(1) For each sample, 3 replicate experiments were performed and the mean
and the standard deviation (SD) were determined. The level of ccnd1 gene
amplification (N ccnd1/alb) is determined by dividing the average ccnd1
copy number value by the average alb copy number value.



point measurement of fluorescence intensity. Here we report myc
and ccnd1 gene dosage in breast cancer by means of quantitative
PCR. (vi) We found a high degree of concordance between
real-time quantitative PCR and Southern blot analysis in terms of
gene amplification, especially for samples with high copy numbers
(>5-fold). The slightly higher frequency of gene amplification
(especially ccnd1 and erbB2) observed by means of real-time
quantitative PCR as compared with Southern-blot analysis may be
explained by the higher sensitivity of the former method. However,
we cannot rule out the possibility that some tumors with a few extra
gene copies observed in real-time PCR had additional copies of an
arm or a whole chromosome (trisomy, tetrasomy or polysomy)
rather than true gene amplification. These 2 types of genetic
alteration (polysomy and gene amplification) could be easily
distinguished in the future by using an additional probe located on
the same chromosome arm, but some distance from the target gene.
It is noteworthy that high gene copy numbers have the greatest
prognostic significance in breast carcinoma (Borg et al., 1992;
Slamon et al., 1987).

Finally, this technique can be applied to the detection of gene
deletion as well as gene amplification. Indeed, we found a
decreased copy number of erbB2 (but not of the other 2 proto-
oncogenes) in several tumors; erbB2 is located in a chromosome
region (17q21) reported to contain both deletions and amplifica-
tions in breast cancer (Bieche and Lidereau, 1995).

In conclusion, gene amplification in various cancers can be used 
as a marker of pre-neoplasia, also for early diagnosis of cancer, 
staging, prognostication and choice of treatment. Southern blotting 
is not sufficiently sensitive, and FISH is lengthy and complex. 
Real-time quantitative PCR overcomes both these limitations, and 
is a sensitive and accurate method of analyzing large numbers of 
samples in a short time. It should find a place in routine clinical 
gene dosage. 

Genome-wide Study of Gene Copy Numbers,
Transcripts, and Protein Levels in Pairs of

Gain and loss of chromosomal material is characteristic
of bladder cancer, as well as malignant transformation in
general. The consequences of these changes at both the
transcription and translation levels are at present unknown
partly because of technical limitations. Here we have at-
tempted to address this question in pairs of non-invasive
and invasive human bladder tumors using a combination
of technology that included comparative genomic hybrid-
ization, high density oligonucleotide array-based monitor-
ing of transcript levels (5600 genes), and high resolution
two-dimensional gel electrophoresis. The results showed
that there is a gene dosage effect that in some cases
superimposes on other regulatory mechanisms. This ef-
fect depended (p < 0.015) on the magnitude of the com-
parative genomic hybridization change. In general (18 of
23 cases), chromosomal areas with more than 2-fold gain
of DNA showed a corresponding increase in mRNA tran-
scripts. Areas with loss of DNA, on the other hand,
showed either reduced or unaltered transcript levels. Be-
cause most proteins resolved by two-dimensional gels
are unknown it was only possible to compare mRNA and
protein alterations in relatively few cases of well focused
abundant proteins. With few exceptions we found a good
correlation (p < 0.005) between transcript alterations and
protein levels. The implications, as well as limitations,
of the approach are discussed. Molecular & Cellular
Proteomics 1:37-45, 2002.

Aneuploidy is a common feature of most human cancers
(1), but little is known about the genome-wide effect of this
phenomenon at both the transcription and translation levels.
High throughput array studies of the breast cancer cell line
BT474 have suggested that there is a correlation between
DNA copy numbers and gene expression in highly amplified
areas (2), and studies of individual genes in solid tumors
have revealed a good correlation between gene dose and
mRNA or protein levels in the case of c-erb-B2, cyclin D1,
ems1, and N-myc (3-5). However, a high cyclin D1 protein
expression has been observed without simultaneous am-
plification, and a low level of c-myc copy number in-
crease was observed without concomitant c-myc protein
overexpression (6).

In human bladder tumors, karyotyping, fluorescent in situ 
hybridization, and comparative genomic hybridization (CGH)¹ 
have revealed chromosomal aberrations that seem to be 
characteristic of certain stages of disease progression. In the 
case of non-invasive pTa transitional cell carcinomas (TCCs), 
this includes loss of chromosome 9 or parts of it, as well as 
loss of Y in males. In minimally invasive pT1 TCCs, the 
following alterations have been reported: 2q-, 11p-, 1q+, 
11q13+, 17q+, and 20q+ (7-12). It has been suggested that 
these regions harbor tumor suppressor genes and 
oncogenes; however, the large chromosomal areas involved often 
contain many genes, making meaningful predictions of the 
functional consequences of losses and gains very difficult. 

In this investigation we have combined genome-wide tech- 
nology for detecting genomic gains and losses (CGH) with 
gene expression profiling techniques (microarrays and pro- 
teomics) to determine the effect of gene copy number on 
transcript and protein levels in pairs of non-invasive and 
invasive human bladder TCCs. 

EXPERIMENTAL PROCEDURES 

Material— Bladder tumor biopsies were sampled after informed 
consent was obtained and after removal of tissue for routine pathol- 
ogy examination. By light microscopy tumors 335 and 532 were 
staged by an experienced pathologist as pTa (superficial papillary), 

1 The abbreviations used are: CGH, comparative genomic hybrid- 
ization; TCC, transitional cell carcinoma; LOH, loss of heterozygosity; 
PA-FABP, psoriasis-associated fatty acid-binding protein; 2D, 
two-dimensional. 
Gene Copy Numbers, Transcripts, and Protein Levels 
Fig. 1. DNA copy number and mRNA expression level. Shown from left to right are chromosome (Chr.), CGH profiles, gene location and 
expression level of specific genes, and overall expression level along the chromosome. A, expression of mRNA in invasive tumor 733 as 
compared with the non-invasive counterpart tumor 335. B, expression of mRNA in invasive tumor 827 compared with the non-invasive 
counterpart tumor 532. The average fluorescent signal ratio between tumor DNA and normal DNA is shown along the length of the chromosome 
(left). The bold curve in the ratio profile represents a mean of four chromosomes and is surrounded by thin curves indicating one standard 
deviation. The central vertical line (broken) indicates a ratio value of 1 (no change), and the vertical lines next to it (dotted) indicate a ratio of 
0.5 (left) and 2.0 (right). In chromosomes where the non-invasive tumor 335 used for comparison showed alterations in DNA content the ratio 
profile of that chromosome is shown to the right of the invasive tumor profile. The colored bars represent one gene each, identified by the 
running numbers above the bars (the name of the gene can be seen at www.MDL.DK/sdata.html). The bars indicate the purported location of 
the gene, and the colors indicate the expression level of the gene in the invasive tumor compared with the non-invasive counterpart: >2-fold 
increase (black), >2-fold decrease (blue), no significant change (orange). The bar to the far right, entitled Expression, shows the resulting change 
in expression along the chromosome; the colors indicate that at least half of the genes were up-regulated (black), at least half of the genes 
down-regulated (blue), or more than half of the genes are unchanged (orange). If a gene was absent in one of the samples and present in 
another, it was regarded as more than a 2-fold change. A 2-fold level was chosen as this corresponded to one standard deviation in a double 
determination of ~1800 genes. Centromeres and heterochromatic regions were excluded from data analysis. 



grade I and II, respectively; tumors 733 and 827 were staged as pT1 
(invasive into submucosa); 733 was staged as solid, and 827 was 
staged as papillary, both grade III. 

mRNA Preparation— Tissue biopsies, obtained fresh from surgery, 
were embedded immediately in a sodium-guanidinium thiocyanate 
solution and stored at -80 °C. Total RNA was isolated using the 
RNAzol B RNA isolation method (WAK-Chemie Medical GMBH). 
Poly(A)+ RNA was isolated by an oligo(dT) selection step (Oligotex 
mRNA kit; Qiagen). 

cRNA Preparation— 1 µg of mRNA was used as starting material. 
The first and second strand cDNA synthesis was performed using the 
Superscript® choice system (Invitrogen) according to the 
manufacturer's instructions but using an oligo(dT) primer containing a T7 RNA 
polymerase binding site. Labeled cRNA was prepared using the 
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and 
UTP (Enzo) were used, together with unlabeled NTPs in the reaction. 
Following the in vitro transcription reaction, the unincorporated 
nucleotides were removed using RNeasy columns (Qiagen). 

Array Hybridization and Scanning— Array hybridization and 
scanning were modified from a previous method (13). 10 µg of cRNA was 
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris 
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization, 
the fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl, 
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min, 
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe 
array cartridge. The probe array was then incubated for 16 h at 40 °C 
at constant rotation (60 rpm). The probe array was exposed to 10 
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T 
at 50 °C. The biotinylated cRNA was stained with a 
streptavidin-phycoerythrin conjugate, 10 µg/ml (Molecular Probes) in 6x SSPE-T 
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The 
probe arrays were scanned at 560 nm using a confocal laser scanning 
microscope (made for Affymetrix by Hewlett-Packard). The readings 
from the quantitative scanning were analyzed by Affymetrix gene 
expression analysis software. 

Microsatellite Analysis— Microsatellite analysis was performed as 
described previously (14). Microsatellites were selected by use of 
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were 
obtained from the genome data base at www.gdb.org. DNA was extracted 
from tumor and blood and amplified by PCR in a volume of 20 µl for 35 
cycles. The amplicons were denatured and electrophoresed for 3 h in an 
ABI Prism 377. Data were collected in the Gene Scan program for 
fragment analysis. Loss of heterozygosity was defined as less than 33% 
of one allele detected in tumor amplicons compared with blood. 
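
The loss-of-heterozygosity criterion above can be written out as a small check. The sketch below is one possible reading of that criterion, not the authors' Gene Scan workflow: it assumes peak heights for the two alleles of a heterozygous marker in blood and in tumor, normalizes the tumor allele ratio to the blood allele ratio, and calls LOH when either allele falls below 33% of its expected share. The function names, the inputs, and the normalization step are all assumptions made for illustration.

```python
def allele_retention(tumor_a, tumor_b, blood_a, blood_b):
    """Tumor allele ratio (a/b) normalized to the blood allele ratio.

    Inputs are hypothetical peak heights for the two alleles of one
    heterozygous microsatellite; all of them must be positive.
    """
    if min(tumor_a, tumor_b, blood_a, blood_b) <= 0:
        raise ValueError("peak heights must be positive")
    return (tumor_a / tumor_b) / (blood_a / blood_b)


def has_loh(tumor_a, tumor_b, blood_a, blood_b, threshold=0.33):
    """Call LOH when either allele is reduced below `threshold`.

    The 0.33 default mirrors the "less than 33%" wording in the text;
    applying it to the normalized ratio is an assumption.
    """
    r = allele_retention(tumor_a, tumor_b, blood_a, blood_b)
    # Either allele may be the one that is lost, so test r and 1/r.
    return min(r, 1.0 / r) < threshold
```

With equal allele peaks in blood, a tumor sample in which one allele drops to a tenth of the other would be called LOH under these assumptions, whereas a drop to 90% would not.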

Proteomic Analysis— TCCs were minced into small pieces and 
homogenized in a small glass homogenizer in 0.5 ml of lysis solution. 
Samples were stored at -20 °C until use. The procedure for 2D gel 
electrophoresis has been described in detail elsewhere (15, 16). Gels 
were stained with silver nitrate and/or Coomassie Brilliant Blue. 
Proteins were identified by a combination of procedures that included 
microsequencing, mass spectrometry, two-dimensional gel Western 
immunoblotting, and comparison with the master two-dimensional gel 
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis. 

CGH— Hybridization of differentially labeled tumor and normal DNA 
to normal metaphase chromosomes was performed as described 
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas 
Red-labeled reference DNA (200 ng), and human Cot-1 DNA (20 µg) were 
denatured at 37 °C for 5 min and applied to denatured normal 
metaphase slides. Hybridization was at 37 °C for 2 days. After washing, 
the slides were counterstained with 0.15 ng/ml 
4,6-diamidino-2-phenylindole in an anti-fade solution. A second hybridization was 
performed for all tumor samples using fluorescein-labeled reference DNA 
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the 
aberrations detected during the initial hybridization. Each CGH 
experiment also included a normal control hybridization using 
fluorescein- and Texas Red-labeled normal DNA. Digital image analysis was 
used to identify chromosomal regions with abnormal fluorescence 
ratios, indicating regions of DNA gains and losses. The average 
green:red fluorescence intensity ratio profiles were calculated using 
four images of each chromosome (eight chromosomes total) with 
normalization of the green:red fluorescence intensity ratio for the 
entire metaphase and background correction. Chromosome 
identification was performed based on 4,6-diamidino-2-phenylindole 
banding patterns. Only images showing uniform high intensity 
fluorescence with minimal background staining were analyzed. All 
centromeres, p arms of acrocentric chromosomes, and 
heterochromatic regions were excluded from the analysis. 

RESULTS 

Comparative Genomic Hybridization— The CGH analysis 
identified a number of chromosomal gains and losses in the 
Table I 

Correlation between alterations detected by CGH and by expression monitoring 

Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as 
independent variable (if expression alteration - what CGH deviation was found). 

Top 

Tumor pair     CGH alterations   Expression change clusters                            Concordance 
733 vs. 335    13 Gain           10 Up-regulation, 0 Down-regulation, 3 No change      77% 
               10 Loss           1 Up-regulation, 5 Down-regulation, 4 No change       50% 
827 vs. 532    10 Gain           8 Up-regulation, 0 Down-regulation, 2 No change       80% 
               12 Loss           3 Up-regulation, 2 Down-regulation, 7 No change       17% 

Bottom 

Tumor pair     Expression change clusters   CGH alterations                  Concordance 
733 vs. 335    16 Up-regulation             11 Gain, 2 Loss, 3 No change     69% 
               21 Down-regulation           1 Gain, 8 Loss, 12 No change     38% 
               15 No change                 3 Gain, 3 Loss, 9 No change      60% 
827 vs. 532    17 Up-regulation             10 Gain, 5 Loss, 2 No change     59% 
               9 Down-regulation            0 Gain, 3 Loss, 8 No change      33% 
               21 No change                 1 Gain, 3 Loss, 17 No change     81% 



two invasive tumors (stage pT1, TCCs 733 and 827), whereas 
the two non-invasive papillomas (stage pTa, TCCs 335 and 
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-, 
and Y-, respectively. Both invasive tumors showed changes 
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-, 
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are 
typical for their disease stage, as well as additional alterations, 
some of which are shown in Fig. 1. Areas with gains and 
losses deviated from the normal copy number to some extent, 
and the average numerical deviation from normal was 0.4-fold 
in the case of TCC 733 and 0.3-fold for TCC 827. The largest 
changes, amounting to at least a doubling of chromosomal 
content, were observed at 1q23 in TCC 733 (Fig. 1A) and 
20q12 in TCC 827 (Fig. 1B). 

mRNA Expression in Relation to DNA Copy Number— The 
mRNA levels from the two invasive tumors (TCCs 827 and 
733) were compared with the two non-invasive counterparts 
(TCCs 532 and 335). This was done in two separate experi- 
ments in which we compared TCCs 733 to 335 and 827 to 
532, respectively, using two different scaling settings for the 
arrays to rule out scaling as a confounding parameter. 
Approximately 1,800 genes that yielded a signal on the arrays 
were searched in the Unigene and Genemap data bases for 
chromosomal location, and those with a known location 
(1096) were plotted as bars covering their purported locus. In 
that way it was possible to construct a graphic presentation of 
DNA copy number and relative mRNA levels along the indi- 
vidual chromosomes (Fig. 1). 

For each mRNA a ratio was calculated between the level in 
the invasive versus the non-invasive counterpart. Bars, which 
represent chromosomal location of a gene, were color-coded 
according to the expression ratio, and only differences larger 
than 2-fold were regarded as informative (Fig. 1). The density 
of genes along the chromosomes varied, and areas 
containing only one gene were excluded from the calculations. The 
resolution of the CGH method is very low, and some of the 
outlier data may be because of the fact that the boundaries of 
the chromosomal aberrations are not known at high resolution. 
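
The scoring rules described in this paragraph and in the Fig. 1 legend can be illustrated with a short sketch. It is an assumption-laden reconstruction, not the Affymetrix software used in the study: `None` stands for a transcript scored as absent, the function names are hypothetical, and when a region has exactly half of its genes up-regulated and half down-regulated the sketch arbitrarily reports "up" first, a tie the text does not address.

```python
def classify_gene(invasive, noninvasive):
    """Score one gene as 'up', 'down', or 'unchanged'.

    Levels are arbitrary array units; None means the transcript was
    scored as absent. Absent in one sample and present in the other is
    treated as more than a 2-fold change, as stated in the Fig. 1 legend.
    """
    if invasive is None and noninvasive is None:
        return "unchanged"
    if noninvasive is None:
        return "up"
    if invasive is None:
        return "down"
    ratio = invasive / noninvasive
    if ratio > 2.0:
        return "up"
    if ratio < 0.5:
        return "down"
    return "unchanged"


def classify_region(calls):
    """Score a chromosomal region from its per-gene calls.

    Regions with only one gene are excluded (returns None). A region is
    'up' or 'down' when at least half of its genes changed that way.
    """
    n = len(calls)
    if n < 2:
        return None
    if 2 * calls.count("up") >= n:
        return "up"
    if 2 * calls.count("down") >= n:
        return "down"
    return "unchanged"
```

For instance, a transcript going from absent to 2670 units is scored "up", and a region with two up-regulated genes out of four is scored as an up-regulated cluster under these rules.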

Two sets of calculations were made from the data. For the 
first set we used CGH alterations as the independent variable 
and estimated the frequency of expression alterations in these 
chromosomal areas. In general, areas with a strong gain of 
chromosomal material contained a cluster of genes having 
increased mRNA expression. For example, chromosomes 
1q21-q25, 2p, and 9q showed a relative gain of more 
than 100% in DNA copy number that was accompanied by 
increased mRNA expression levels in the two tumor pairs (Fig. 
1). In most cases, chromosomal gains detected by CGH were 
accompanied by an increased level of transcripts in both 
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal 
losses, on the other hand, were not accompanied by 
decreased expression in several cases, and were often 
registered as having unaltered RNA levels (Table I, top). The 
inability to detect RNA expression changes in these cases was not 
because of fewer genes mapping to the lost regions (data not 
shown). 
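
The concordance percentages quoted from Table I (top) follow directly from the region counts. The sketch below recomputes them as a check; the counts are transcribed from Table I, while the dictionary layout and function names are assumptions introduced here. Concordance is taken to mean the share of gained regions with up-regulated clusters, or of lost regions with down-regulated clusters.

```python
# Region counts transcribed from Table I (top): for each CGH alteration,
# how many regions showed up-regulated, down-regulated, or unchanged clusters.
TABLE_I_TOP = {
    "733 vs. 335": {
        "gain": {"up": 10, "down": 0, "unchanged": 3},
        "loss": {"up": 1, "down": 5, "unchanged": 4},
    },
    "827 vs. 532": {
        "gain": {"up": 8, "down": 0, "unchanged": 2},
        "loss": {"up": 3, "down": 2, "unchanged": 7},
    },
}

# A gain is concordant with up-regulation, a loss with down-regulation.
EXPECTED = {"gain": "up", "loss": "down"}


def concordance(counts, expected):
    """Percentage of regions whose expression change matches `expected`."""
    total = sum(counts.values())
    return round(100 * counts[expected] / total)


def summarize(table):
    """Map (tumor pair, CGH alteration) to its concordance percentage."""
    return {
        (pair, alteration): concordance(counts, EXPECTED[alteration])
        for pair, alterations in table.items()
        for alteration, counts in alterations.items()
    }
```

Calling `summarize(TABLE_I_TOP)` reproduces the table's 77% and 80% for gains and 50% and 17% for losses, which makes the asymmetry between gains and losses explicit.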

[Fig. 2 plot. Panels: Tumor 827 versus 532; Tumor 733 versus 335. Regions are grouped as "Expression changes detected" or "Expression changes not detected".] 

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array 
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (A) and 733 (♦) and their non-invasive 
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting 
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated 
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the 
ratio value of one were included. 

In the second set of calculations we selected expression 
alterations above 2-fold as the independent variable and 
estimated the frequency of CGH alterations in these areas. As 
above, we found that increased transcript expression 
correlated with gain of chromosomal material (TCC 733, 69% and 
TCC 827, 59%), whereas reduced expression was often 
detected in areas with unaltered CGH ratios (Table I, bottom). 
Furthermore, as a control we looked at areas with no 
alteration in expression. No alteration was detected by CGH in 
most of these areas (TCC 733, 60% and TCC 827, 81%; see 
Table I, bottom). Because the ability to observe reduced or 
increased mRNA expression clustering to a certain 
chromosomal area clearly reflected the extent of copy number 
changes, we plotted the maximum CGH aberrations in the 
regions showing CGH changes against the ability to detect a 
change in mRNA expression as monitored by the 
oligonucleotide arrays (Fig. 2). For both tumors, TCC 733 (p < 0.015) and 
TCC 827 (p < 0.00003), a highly significant correlation was 
observed between the level of CGH ratio change (reflecting 
the DNA copy number) and alterations detected by the array 
based technology (Fig. 2). Similar data were obtained when 
areas with altered expression were used as independent 
variables. These areas correlated best with CGH when the CGH 
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did 
not at lower CGH deviations. These data probably reflect that 
loss of an allele may only lead to a 50% reduction in 
expression level, which is at the cut-off point for detection of 
expression alterations. Gain of chromosomal material can occur to a 
much larger extent. 
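
The point about single-allele loss sitting at the detection cut-off can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that transcript level is strictly proportional to gene copy number and that a change is called only when it is strictly greater than 2-fold; neither assumption is claimed by the authors, and the function names are hypothetical.

```python
def expected_ratio(copies, normal_copies=2):
    """Expected expression ratio if expression scaled linearly with dose."""
    return copies / normal_copies


def is_detectable(copies, normal_copies=2, fold_cutoff=2.0):
    """True when the expected ratio exceeds the fold cut-off in either direction."""
    ratio = expected_ratio(copies, normal_copies)
    return ratio > fold_cutoff or ratio < 1.0 / fold_cutoff
```

Under these assumptions, losing one of two alleles gives a ratio of exactly 0.5, which lies on the cut-off rather than beyond it, so the loss goes unscored. Gains have no such ceiling: five copies give a ratio of 2.5, which is scored.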

Microsatellite-based Detection of Minor Areas of Losses— In 
TCC 733, several chromosomal areas exhibiting DNA 
amplification were preceded or followed by areas with a 
normal CGH but reduced mRNA expression (see Fig. 1, TCC 733 
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and 
10q22). To determine whether these results were because of 
undetected loss of chromosomal material in these regions or 
because of other non-structural mechanisms regulating 
transcription, we examined two microsatellites positioned at 
chromosome 1q25-32 and two at chromosome 2p22. Loss of 
heterozygosity (LOH) was found at both 1q25 and at 2p22, 
indicating that minor deleted areas were not detected with the 
resolution of CGH (Fig. 3). Additionally, chromosome 2p in 
TCC 733 showed a CGH pattern of gain/no change/gain of 
DNA that correlated with transcript 
increase/decrease/increase. Thus, for the areas showing increased expression 
there was a correlation with the DNA copy number alterations 
(Fig. 1A). As indicated above, the mRNA decrease observed in 
the middle of the chromosomal gain was because of LOH, 
implying that one of the mechanisms for mRNA 
down-regulation may be regions that have undergone smaller losses of 
chromosomal material. However, this cannot be detected with 
the resolution of the CGH method. 

In both TCC 733 and TCC 827, the telomeric end of chro- 
mosome 11p showed a normal ratio in the CGH analysis; 
however, clusters of five and three genes, respectively, lost 
their expression. Two microsatellites (D11S1760, D11S922) 
positioned close to MUC2, IGF2, and cathepsin D indicated 
LOH as the most likely mechanism behind the loss of expres- 
sion (data not shown). 

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24 
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2, 
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 
Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor 
733 showing loss of heterozygosity at chromosome 1q25, detected 
(a) by D1S215 close to Hu class I histocompatibility antigen (gene 
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene 
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close 
to general β-spectrin (gene number 11 in Fig. 1) and (d) tumor 827 
showing loss of heterozygosity at chromosome 18q12 by D18S1118 
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number 
12 in Fig. 1). The upper curves show the electropherogram obtained 
from normal DNA from leukocytes (N), and the lower curves show the 
electropherogram from tumor DNA (T). In all cases one allele is 
partially lost in the tumor amplicon. 

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that 
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Relation between Changes in mRNA and Protein Levels— 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 
Fig. 4. Correlation between protein levels as judged by 
2D-PAGE and transcript ratio. For comparison proteins were divided in 
three groups, unaltered in level or up- or down-regulated (horizontal 
axis). The mRNA ratio as determined by oligonucleotide arrays was 
plotted for each gene (vertical axis). A, mRNAs that were scored as 
present in both tumors used for the ratio calculation; A, mRNAs that 
were scored as absent in the invasive tumors (along horizontal axis) or 
as absent in noninvasive reference (top of figure). Two different 
scalings were used to exclude scaling as a confounder; TCCs 827 
and 532 (4A) were scaled with background suppression, and TCCs 
733 and 335 (eO) were scaled without suppression. Both 
comparisons showed highly significant (p < 0.005) differences in mRNA ratios 
between the groups. Proteins shown were as follows: Group A (from 
left), phosphoglucomutase 1, glutathione transferase class µ number 
4, fatty acid-binding protein homologue, cytokeratin 15, and 
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa 
heat shock protein, cytokeratin 13, and calcyclin; C (from left), 
α-enolase, hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and 
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from 
top), glutathione S-transferase-π and mesothelial keratin K7 (type II); 
F (from top and left), adenylyl cyclase-associated protein, E-cadherin, 
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV, 
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin 
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa 
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B, 
translationally controlled tumor protein, liver 
glyceraldehyde-3-phosphate dehydrogenase, keratin 8, aldehyde reductase, and 
Na,K-ATPase β-1 subunit; G (from top and left), TCP20, calgizzarin, 
70-kDa heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP 
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver 
glyceraldehyde-3-phosphate dehydrogenase, glutathione 
S-transferase-π, and keratin 8; H (from left), plasma gelsolin, autoantigen 
calreticulin, thioredoxin, and NAD+-dependent 15-hydroxyprostaglandin 
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit, 
cytokeratin 20, cytokeratin 17, prohibitin, and fructose 
1,6-bisphosphatase; J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat 
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78, 
cyclophilin, and cofilin. 

gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant corre- 
lation (p < 0.005) between mRNA and protein alterations (Fig. 
4). Only one gene showed disagreement between transcript 
alteration and protein alteration. Except for a group of 
cytokeratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal 
location were detected in TCCs 733 and 335, and of these 19 
correlated (p < 0.005) with the mRNA changes detected using 
the arrays (Fig. 4). For example, PA-FABP was highly 
expressed in the non-invasive TCC 335 but lost in the invasive 
counterpart (TCC 733; see Fig. 5). The smaller number of 
proteins detected in both 733 and 335 was because of the 
smaller size of the biopsies that were available. 

Fig. 5. Comparison of protein and transcript levels in invasive 
and non-invasive TCCs. The upper part of the figure shows a 2D gel 
(left) and the oligonucleotide array (right) of TCC 532. The red 
rectangles on the upper gel highlight the areas that are compared below. 
Identical areas of 2D gels of TCCs 532 and 827 are shown below. 
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC 
827 (red annotation). The tile on the array containing probes for 
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532 
and is compared with TCC 827. The upper row of squares in each tile 
corresponds to perfect match probes; the lower row corresponds to 
mismatch probes containing a mutation (used for correction for 
unspecific binding). Absence of signal is depicted as black, and the 
higher the signal the lighter the color. A high transcript level was 
detected in TCC 532 (6151 units) whereas a much lower level was 
detected in TCC 827 (absence of signals). For cytokeratin 13, a high 
transcript level was also present in TCC 532 (15659 units), and a 
much lower level was present in TCC 827 (623 units). The 2D gels at 
the bottom of the figure (left) show levels of PA-FABP and 
adipocyte-FABP in TCCs 335 and 733 (invasive), respectively. Both proteins are 
down-regulated in the invasive tumor. To the right we show the array 
tiles for the PA-FABP transcript. A medium transcript level was 
detected in the case of TCC 335 (1277 units) whereas very low levels 
were detected in TCC 733 (166 units). IEF, isoelectric focusing. 

11 chromosomal regions where CGH showed aberrations 
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these 
proteins were encoded by genes in chromosome 17q, a 
frequently amplified chromosomal area in invasive bladder 
cancers. 

DISCUSSION 

Most human cancers have abnormal DNA content, having 
lost some chromosomal parts and gained others. The present 
study provides some evidence as to the effect of these gains 
and losses on gene expression in two pairs of non-invasive 
and invasive TCCs using high throughput expression arrays 
and proteomics in combination with CGH. In general, the 
results showed that there is a clear individual regulation of the 
mRNA expression of single genes, which in some cases was 
superimposed by a DNA copy number effect. In most cases, 
genes located in chromosomal areas with gains exhibited 
increased mRNA expression, whereas areas showing 
losses showed either no change or a reduced mRNA 
expression. The latter might be because of the fact that losses most 
often are restricted to loss of one allele, and the cut-off point 
for detection of expression alterations was a 2-fold change, 
thus being at the border of detection. In several cases, 
however, an increase or decrease in DNA copy number was 
associated with de novo occurrence or complete loss of 
transcript, respectively. Some of these transcripts could not be 
detected in the non-invasive tumor but were present at 
relatively high levels in areas with DNA amplifications in the 
invasive tumors (e.g. in TCC 733 transcript from cellular ligand of 
annexin II gene (chromosome 1q21) from absent to 2670 
arbitrary units; in TCC 827 transcript from small proline-rich 
protein 1 gene (chromosome 1q12-q21.1) from absent to 
1326 arbitrary units). It may be anticipated from these data 
that significant clustering of genes with an increased 
expression to a certain chromosomal area indicates an increased 
likelihood of gain of chromosomal material in this area. 

Table II 

Proteins whose expression level correlates with both mRNA and gene dose changes 

Protein                 Chromosomal location   Tumor TCC   CGH alteration   Transcript alteration (a)   Protein alteration 
Annexin II              1q21                   733         Gain             Abs to Pres (a)             Increase 
Annexin IV              2p13                   733         Gain             3.9-Fold up                 Increase 
Cytokeratin 17          17q12-q21              827         Gain             3.8-Fold up                 Increase 
Cytokeratin 20          17q21.1                827         Gain             5.6-Fold up                 Increase 
(PA-)FABP               8q21.2                 827         Loss             10-Fold down                Decrease 
FBP1                    9q22                   827         Gain             2.3-Fold up                 Increase 
Plasma gelsolin         9q31                   827         Gain             Abs to Pres                 Increase 
Heat shock protein 28   15q12-q13              827         Loss             2.5-Fold up                 Decrease 
Prohibitin              17q21                  827/733     Gain             3.7-/2.5-Fold up (b)        Increase 
Prolyl-4-hydroxyl       17q25                  827/733     Gain             5.7-/1.6-Fold up            Increase 
hnRNP B1                7p15                   827         Loss             2.5-Fold down               Decrease 

(a) Abs, absent; Pres, present. 

(b) In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733. 

Considering the many possible regulatory mechanisms 
acting at the level of transcription, it seems striking that the gene 
dose effects were so clearly detectable in gained areas. One 
hypothetical explanation may lie in the loss of controlled 
methylation in tumor cells (17-19). Thus, it may be possible 
that in chromosomes with increased DNA copy numbers two 
or more alleles could be demethylated simultaneously leading 
to a higher transcription level, whereas in chromosomes with 
losses the remaining allele could be partly methylated, turning 
off the process (20, 21). A recent report has documented a 
ploidy regulation of gene expression in yeast, but in this case all 
the genes were present in the same ratio (22), a situation that is 
not analogous to that of cancer cells, which show marked 
chromosomal aberrations, as well as gene dosage effects. 

Several CGH studies of bladder cancer have shown that 
some chromosomal aberrations are common at certain 
stages of disease progression, often occurring in more than 1 
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y- 
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+, 
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here 
showed similar aberrations such as 9p- and 9q22-q33- and 
9q- and Y-, respectively. Likewise, the two minimal invasive 
pT1 tumors showed aberrations that are commonly seen at 
that stage, and TCC 827 had a remarkable resemblance to the 
commonly seen pattern of losses and gains, such as 1q22-24 
amplification (seen in both tumors), 11q14-q22 loss, the latter 

arm and that the use of cDNA microarrays for analysis of DNA 
copy number changes will reach a resolution that can resolve 
these changes, as has recently been proposed (2). The outlier 
data were not more frequent at the boundaries of the CGH 
aberrations. At present we do not know the mechanism 
behind chromosomal aneuploidy and cannot predict whether 
chromosomal gains will be transcribed to a larger extent than 
the two native alleles. A mechanism such as genetic imprinting has 
an impact on the expression level in normal cells and is often 
reduced in tumors. However, the relation between imprinting 
and gain of chromosomal material is not known. 

We regard it as a strength of this investigation that we were 
able to compare invasive tumors to benign tumors rather than 
to normal urothelium, as the tumors studied were biologically 
very close and probably may represent successive steps in 
the progression of bladder cancer. Despite the limited amount 
of fresh tissue available it was possible to apply three different 
state of the art methods. The observed correlation between 
DNA copy number and mRNA expression is remarkable when 
one considers that different pieces of the tumor biopsies were 
used for the different sets of experiments. This indicates that 
bladder tumors are relatively homogenous, a notion recently 
supported by CGH and LOH data that showed a remarkable 
similarity even between tumors and distant metastasis (10, 23). 

In the few cases analyzed, mRNA and protein levels 
showed a striking correspondence although in some cases 
we found discrepancies that may be attributed to translational 
regulation, post-translational processing, protein 
degradation, or a combination of these. Some transcripts belong to 
undertranslated mRNA pools, which are associated with few 
translationally inactive ribosomes; these pools, however, 
seem to be rare (24). Protein degradation, for example, may 
be very important in the case of polypeptides with a short 
half-life (e.g. signaling proteins). A poor correlation between 
mRNA and protein levels was found in liver cells as 
determined by arrays and 2D-PAGE (25), and a moderate 
correlation was recently reported by Ideker et al. (26) in yeast 
(interestingly, our study revealed a much better correlation 
between gained chromosomal areas and increased mRNA 

often linked to 17 q+ (both tumors), and 1q+ and 9p-, often levels than between loss of chromosomal areas and reduced 

linked to 20q+ and 11 q13+ (both tumors) (7-9). These ob- mRNA levels. In general, the level of CGH change determined 

servations indicate that the pairs of tumors used in this study the ability to detect a change in transcript} One possible 

exhibit chromosomal changes observed in many tumors, and explanation could be that by losing one allele the change in 

therefore the findings could be of general importance for mRNA level is not so dramatic as compared with gain of 

bladder cancer. material, which can be rather unlimited and may lead to a 

Considering that the mapping resolution of CGH is of about severatfold increase in gene copy number resulting In a much 

20 megabases it Is only possible to get a crude picture of higher Impact on transcript level. The latter would be much 

chromosomal instability using this technique. Occasionally, easier to detect on the expression arrays as the cut-off point 

we observed reduced transcript levels close to or inside re- was placed at a 2-fold level so as not to be biased by noise on 

glons with increased copy numbers. Analysis of these regions the array. Construction of arrays with a better signal to noise 

by positioning heterozygous microsatellites as close as pos- ratio may in the future allow detection of lesser than 2-fold 

sible to the locus showing reduced gene expression revealed alterations in transcript levels, a feature that may facilitate the 

loss of heterozygosity in several cases. It seems likely that analysis of the effect of loss of chromosomal areas on tran- 

multiple and different events occur along each chromosomal script levels. 



44 Molecular & Cellular Proteomics 1.1



Gene Copy Numbers, Transcripts, and Protein Levels 



In eleven cases we found a significant correlation between 
DNA copy number, mRNA expression, and protein level. Four 
of these proteins were encoded by genes located at a fre- 
quently amplified area in chromosome 17q. Whether DNA 
copy number is one of the mechanisms behind alteration of 
these eleven proteins is at present unknown and will have to 
be proved by other methods using a larger number of sam- 
ples. One factor making such studies complicated is the large 
extent of protein modification that occurs after translation, 
requiring immunoidentification and/or mass spectrometry to
correctly identify the proteins in the gels. 

In conclusion, the results presented in this study exemplify
the large body of knowledge that may be possible to gather in 
the future by combining state of the art techniques that follow 
the pathway from DNA to protein (26). Here, we used a tradi- 
tional chromosomal CGH method, but in the future high reso- 
lution CGH based on microarrays with many thousand radiation 
hybrid-mapped genes will increase the resolution and informa-
tion derived from these types of experiments (2). Combined with
expression arrays analyzing transcripts derived from genes with 
known locations, and 2D gel analysis to obtain information at 
the post-translational level, a clearer and more developed un- 
derstanding of the tumor genome will be forthcoming. 
ABSTRACT

Genetic changes underlie tumor progression and may lead to cancer-
specific expression of critical genes. Over 1100 publications have de-
scribed the use of comparative genomic hybridization (CGH) to analyze
the pattern of copy number alterations in cancer, but very few of the genes
affected are known. Here, we performed high-resolution CGH analysis on
cDNA microarrays in breast cancer and directly compared copy number
and mRNA expression levels of 13,824 genes to quantitate the impact of
genomic changes on gene expression. We identified and mapped the
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12
Mb. Throughout the genome, both high- and low-level copy number
changes had a substantial impact on gene expression, with 44% of the
highly amplified genes showing overexpression and 10.5% of the highly
overexpressed genes being amplified. Statistical analysis with random
permutation tests identified 270 genes whose expression levels across 14
samples were systematically attributable to gene amplification. These
included most previously described amplified genes in breast cancer and
many novel targets for genomic alterations, including the HOXB7 gene,
the presence of which in a novel amplicon at 17q21.3 was validated in
10.2% of primary breast cancers and associated with poor patient prog-
nosis. In conclusion, CGH on cDNA microarrays revealed hundreds of
novel genes whose overexpression is attributable to gene amplification.
These genes may provide insights to the clonal evolution and progression
of breast cancer and highlight promising therapeutic targets.

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have
facilitated classification of cancers into biologically distinct catego- 
ries, some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited.

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against ampli- 
fied oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over 
Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of
over- and underexpressed genes (Y axis) according to copy number ratios (X axis).
Threshold values used for over- and underexpression were >2.)84 (global upper 7% of
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage
of amplified and deleted genes according to expression ratios. Threshold values for
amplification and deletion were >U and <0.7.



20 recurrent regions of DNA amplification have been mapped in 
breast cancer by CGH (9, 10). However, these amplicons are often
large and poorly defined, and their impact on gene expression remains 
unknown. 

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively in- 
volved in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and 
CGH microarrays to: (a) determine the global impact that gene copy 
number variation plays in breast cancer development and progression; 
and (b) identify and characterize those genes whose mRNA expres- 



3 The abbreviations used are: CGH, comparative genomic hybridization; FISH, fluo- 
rescence in situ hybridization; RT-PCR, reverse transcription- PCR. 
Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile (blue
line) across the entire genome from 1p telomere to Xq telomere is shown along with ±1 SD (orange lines). The black horizontal line indicates a ratio of 1.0; red line, a ratio of 0.8;
and green line, a ratio of 1.2. B-C, genome-wide copy number analysis in MCF-7 by CGH on cDNA microarray. The copy number ratios were plotted as a function of the position
of the cDNA clones along the human genome. In B, individual data points are connected with a line, and a moving median of 10 adjacent clones is shown. Red horizontal line, the
copy number ratio of 1.0. In C, individual data points are labeled by color coding according to cDNA expression ratios. The bright red dots indicate the upper 2%, and dark red dots,
the next 5% of the expression ratios in MCF-7 cells (overexpressed genes); bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the expression ratios
(underexpressed genes); the rest of the observations are shown with black crosses. The chromosome numbers are shown at the bottom of the figure, and chromosome boundaries are
indicated with a dashed line.



sion is most significantly associated with amplification of the corre- 
sponding genomic template. 

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20,
BT-474, HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468,
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the 
American Type Culture Collection (Manassas, VA). Cells were grown under 
recommended culture conditions. Genomic DNA and mRNA were isolated 
using standard protocols. 

Copy Number and Expression Analyses by cDNA Microarrays. The
preparation and printing of the 13,824 cDNA clones on glass slides were
performed as described (11-13). Of these clones, 244 represented uncharac-
terized expressed sequence tags, and the remainder corresponded to known
genes. CGH experiments on cDNA microarrays were done as described (14,
15). Briefly, 20 μg of genomic DNA from breast cancer cell lines and normal
human WBCs were digested for 14-18 h with AluI and RsaI (Life Technol-
ogies, Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six
μg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and
posthybridization washes (13) were done as described. For the expression
analyses, a standard reference (Universal Human Reference RNA; Stratagene,
La Jolla, CA) was used in all experiments. Forty μg of reference RNA were
labeled with Cy3-dUTP and 3.5 μg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo
Alto, CA) was used to measure the fluorescence intensities at the target
locations using the DEARRAY software (16). After background subtraction,
average intensities at each clone in the test hybridization were divided by the
average intensity of the corresponding clone in the control hybridization. For
the copy number analysis, the ratios were normalized on the basis of the
distribution of ratios of all targets on the array and for the expression analysis
on the basis of 88 housekeeping genes, which were spotted four times onto the
array. Low quality measurements (i.e., copy number data with mean reference
intensity <100 fluorescent units, and expression data with both test and
reference intensity <100 fluorescent units and/or with spot size <50 units)



were excluded from the analysis and were treated as missing values. The 
distributions of fluorescence ratios were used to define cutpoints for increased/ 
decreased copy number. Genes with CGH ratio >1.43 (representing the upper
5% of the CGH ratios across all experiments) were considered to be amplified, 
and genes with ratio <0.73 (representing the lower 5%) were considered to be 
deleted. 
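
The percentile-based cutpoints described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the handling of missing values as None, and the index-based percentile rule are all assumptions made for illustration, and the toy ratios are invented.

```python
def percentile_cutpoints(ratios, tail_percent=5):
    """Return (low, high) cutpoints that leave about `tail_percent` percent
    of the measured ratios in each tail. Missing values (None) are ignored.
    Hypothetical helper, not taken from the paper."""
    values = sorted(r for r in ratios if r is not None)
    n = len(values)
    k = n * tail_percent // 100      # number of values in each tail
    return values[k], values[n - 1 - k]


def classify(ratio, low, high):
    """Label one CGH ratio relative to the cutpoints."""
    if ratio is None:
        return "missing"
    if ratio > high:
        return "amplified"
    if ratio < low:
        return "deleted"
    return "normal"


# Toy data: 100 evenly spaced ratios from 0.50 to 1.49 (illustrative only).
ratios = [i / 100 for i in range(50, 150)]
low, high = percentile_cutpoints(ratios)
labels = [classify(r, low, high) for r in ratios]
```

With real data the cutpoints would be computed once over all experiments, as the text states, and then applied to every clone.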

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate
the influence of copy number alterations on gene expression, we applied the
following statistical approach. CGH and cDNA calibrated intensity ratios were
log-transformed and normalized using median centering of the values in each
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were
median centered. For each gene, the CGH data were represented by a vector
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistics (1). We calculated a weight, $w_g$, for each gene as follows:

$$w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}}$$

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression
levels for amplified and nonamplified cell lines, respectively. To assess the
statistical significance of each weight, we performed 10,000 random permu-
tations of the label vector. The probability that a gene had a larger or equal
weight by random permutation than the original weight was denoted by α. A
low α (<0.05) indicates a strong association between gene expression and
amplification.
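
As a rough illustration of this procedure, the following Python sketch computes a signal-to-noise weight (difference of the group means divided by the sum of the group SDs) and estimates the permutation probability for one gene. It is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the sample SD is assumed because the text does not say which SD was used, and the expression values are invented.

```python
import random
import statistics


def signal_to_noise(expr, labels):
    """Weight for one gene: (mean_amp - mean_nonamp) / (sd_amp + sd_nonamp).
    `labels` holds 1 for amplified cell lines and 0 otherwise.
    Assumes the sample SD; returns None when the weight is undefined."""
    amp = [x for x, lab in zip(expr, labels) if lab == 1]
    non = [x for x, lab in zip(expr, labels) if lab == 0]
    if len(amp) < 2 or len(non) < 2:
        return None
    denom = statistics.stdev(amp) + statistics.stdev(non)
    if denom == 0:
        return None
    return (statistics.mean(amp) - statistics.mean(non)) / denom


def permutation_alpha(expr, labels, n_perm=10000, seed=0):
    """Fraction of random label permutations whose weight is greater than
    or equal to the observed weight."""
    observed = signal_to_noise(expr, labels)
    if observed is None:
        return None
    rng = random.Random(seed)        # fixed seed for reproducibility
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        w = signal_to_noise(expr, shuffled)
        if w is not None and w >= observed:
            hits += 1
    return hits / n_perm


# Invented log expression values for 14 cell lines; the first 4 are amplified.
expr = [3.0, 3.2, 2.9, 3.1, 0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 0.0, -0.3, 0.2, 0.1]
labels = [1, 1, 1, 1] + [0] * 10
alpha = permutation_alpha(expr, labels, n_perm=2000)
```

Shuffling the label vector keeps the number of amplified cell lines fixed, so each permutation asks how often a random assignment of the same size separates expression as well as the observed one.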

Genomic Localization of cDNA Clones and Amplicon Mapping. Each
cDNA clone on the microarray was assigned to a UniGene cluster using the
UniGene Build 141.* A database of genomic sequence alignment information
for mRNA sequences was created from the August 2001 freeze of the Uni- 
versity of California Santa Cruz's GoldenPath database. 7 The chromosome and 
bp positions for each cDNA clone were then retrieved by relating these data 
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two 
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three 
adjacent clones in a single cell line. The amplicon start and end positions were 
Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH microarray

Location              Start (Mb)   End (Mb)   Size (Mb)
1p13                    132.79      132.94       0.2
1q21                    173.92      177.25       3.3
1q22                    179.28      179.57       0.3
3p14                     71.94       74.66       2.7
7p12.1-7p11.2            55.62       60.95       5.3
7q31                    125.73      130.96       5.2
7q32                    140.01      140.68       0.7
8q21.11-8q21.13          86.45       92.46       6.0
8q21.3                   98.45      103.05       4.6
8q23.3-8q24.14          129.88      142.15      12.3
8q24.22                 151.21      152.16       1.0
9p13                     38.65       39.25       0.6
13q22-q31                77.15       81.38       4.2
16q22                    86.70       87.62       0.9
17q11                    29.30       30.85       1.6
17q12-q21.2              39.79       42.80       3.0
17q21.32-q21.33          52.47       55.80       3.3
17q22-q23.3              63.81       69.70       5.9
17q23.3-q24.3            69.93       74.99       5.1
19q13                    40.63       41.40       0.8
20q11.22                 34.19       35.85       1.3
20q13.12                 44.00       45.62       1.6
20q13.12-q13.13          46.43       49.43       3.0
20q13.2-q13.32           51.32       59.12       7.8

CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In ad-
dition, novel amplicons were identified at 9p13 (38.65-39.25 Mb)
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct correla-
tion of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly overex-
pressed. A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly overex-
pressed genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel



extended to include neighboring nonamplified clones (ratio, < 1.5). The am- 
plicon size determination was partially dependent on local clone density. 
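
The amplicon definition given in this section (ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched in Python as follows. This is a hedged illustration only. The function names are hypothetical, clones are assumed to be ordered by position along one chromosome, a missing value (None) is assumed to break a run, and the two-cell-line rule is interpreted as an overlap of at least two clones between runs in different cell lines. The extension of boundaries to neighboring nonamplified clones is not implemented.

```python
def runs_above(ratios, threshold=2.0):
    """Return inclusive (start, end) index pairs of consecutive clones
    whose ratio exceeds `threshold`. None is treated as not amplified."""
    runs, start = [], None
    for i, r in enumerate(ratios):
        above = r is not None and r > threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(ratios) - 1))
    return runs


def call_amplicons(profiles, threshold=2.0):
    """profiles: dict mapping cell line name -> list of CGH ratios for the
    same ordered clones. Returns merged (start, end) clone index ranges."""
    per_line = {name: runs_above(r, threshold) for name, r in profiles.items()}
    calls = set()
    # Rule 1: at least three adjacent clones in a single cell line.
    for runs in per_line.values():
        for s, e in runs:
            if e - s + 1 >= 3:
                calls.add((s, e))
    # Rule 2: at least two adjacent clones shared by two or more cell lines.
    names = list(per_line)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for s1, e1 in per_line[a]:
                for s2, e2 in per_line[b]:
                    s, e = max(s1, s2), min(e1, e2)
                    if e - s + 1 >= 2:
                        calls.add((s, e))
    # Merge overlapping calls into single regions.
    merged = []
    for s, e in sorted(calls):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


# Invented ratios for seven ordered clones in three cell lines.
profiles = {
    "A": [1.0, 2.5, 2.6, 1.1, 1.0, 1.0, 1.0],
    "B": [1.0, 2.2, 2.4, 1.0, 1.0, 3.0, 1.0],
    "C": [1.0, 1.0, 1.0, 1.0, 2.1, 2.3, 2.2],
}
amplicons = call_amplicons(profiles)
```

In this toy input, clones 1-2 qualify through the two-cell-line rule (lines A and B) and clones 4-6 through the single-cell-line rule (line C); the isolated high clone 5 in line B does not qualify on its own.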

FISH. Dual-color interphase FISH to breast cancer cell lines was done as
described (17). Bacterial artificial chromosome clone RP11-361KS was la-
beled with SpectrumOrange (Vysis, Downers Grove, IL), and Spectrum-
Orange-labeled probe for EGFR was obtained from Vysis. SpectrumGreen-
labeled chromosome 7 and 17 centromere probes (Vysis) were used as a
reference. A tissue microarray containing 612 formalin-fixed, paraffin-embed-
ded primary breast cancers (17) was applied in FISH analyses as described
(18). The use of these specimens was approved by the Ethics Committee of the
University of Basel and by the NIH. Specimens containing a 2-fold or higher
increase in the number of test probe signals, as compared with corresponding 
centromere signals, in at least 10% of the tumor cells were considered to be 
amplified. Survival analysis was performed using the Kaplan-Meier method 
and the log-rank test. 

RT-PCR. The HOXB7 expression level was determined relative to 
GAPDH. Reverse transcription and PCR amplification were performed using 
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-GAGCAGAGGGACTCGGACTT-3'
and 5'-GCGTCAGOTAGCGATTGTAG-3'. 

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824 
arrayed cDNA clones were applied for analysis of gene expression 
and gene copy number (CGH microarrays) in 14 breast cancer cell
lines. The results illustrate a considerable influence of copy number
on gene expression patterns. Up to 44% of the highly amplified
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to
the global upper 7% of expression ratios), compared with only 6% for
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although 
less dramatic, outcomes on gene expression (Fig. 1). 
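
The two percentages reported above are different conditional proportions: the share of highly amplified genes that are overexpressed, and the share of highly expressed genes that show increased copy number. A minimal Python sketch of that distinction follows. The helper name, the overexpression cutoff of 2.0, and the toy gene list are hypothetical; only the >2.5, >10, and >1.43 thresholds are taken from the text.

```python
def conditional_fraction(genes, given, outcome):
    """Fraction of the genes satisfying `given` that also satisfy `outcome`.
    Each gene is a (copy_number_ratio, expression_ratio) pair."""
    selected = [g for g in genes if given(g)]
    if not selected:
        return None
    return sum(1 for g in selected if outcome(g)) / len(selected)


# Invented (copy number ratio, expression ratio) pairs, illustration only.
genes = [(3.0, 12.0), (2.8, 1.0), (1.0, 11.0), (1.0, 1.0), (2.6, 3.0)]

OVEREXPRESSION_CUTOFF = 2.0   # hypothetical stand-in for the upper-7% cutoff

# Share of highly amplified genes (CGH ratio > 2.5) that are overexpressed.
overexpressed_given_amplified = conditional_fraction(
    genes,
    given=lambda g: g[0] > 2.5,
    outcome=lambda g: g[1] > OVEREXPRESSION_CUTOFF,
)

# Share of highly expressed genes (cDNA ratio > 10) with increased copy number.
amplified_given_overexpressed = conditional_fraction(
    genes,
    given=lambda g: g[1] > 10,
    outcome=lambda g: g[0] > 1.43,
)
```

The two fractions generally differ because they condition on different gene sets, which is why 44% and 10.5% can both describe the same data.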

Identification of Distinct Breast Cancer Amplicons. Base-pair 
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2, Supple-
ment Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal
Fig. 4. List of 50 genes with a statistically
significant correlation (α value <0.05) between
gene copy number and gene expression. Name,
chromosomal location, and the α value for each
gene are indicated. The genes have been ordered
according to their position in the genome. The color
maps on the right illustrate the copy number and
expression ratio patterns in the 14 cell lines. The
key to the color code is shown at the bottom of the
graph. Gray squares, missing values. The complete
list of 270 genes is shown in supplemental Fig. B.
amplification was validated to be present in 10.2% of 363 primary
breast cancers by FISH to a tissue microarray and was associated 
with poor prognosis of the patients (P = 0.001). 

Statistical Identification and Characterization of 270 Highly
Expressed Genes in Amplicons. Statistical comparison of expres-
sion levels of all genes as a function of gene amplification identified
270 genes whose expression was significantly influenced by copy
number across all 14 cell lines (Fig. 4, Supplemental Fig. B). Accord-
ing to the gene ontology data, 91 of the 270 genes represented
hypothetical proteins or genes with no functional annotation, whereas
179 had associated functional information available. Of these, 151
(84%) are implicated in apoptosis, cell proliferation, signal transduc-
tion, and transcription, whereas 28 (16%) had functional annotations
that could not be directly linked with cancer.



DISCUSSION 

The importance of recurrent gene and chromosome copy number 
changes in the development and progression of solid tumors has been 
characterized in >1000 publications applying CGH (9, 10), as well
as in a large number of other molecular cytogenetic, cytogenetic, and
molecular genetic studies. The effects of these somatic genetic
changes on gene expression levels have remained largely unknown,
although a few studies have explored gene expression changes occur-
ring in specific amplicons (15, 19-21). Here, we applied genome-
wide cDNA microarrays to identify transcripts whose expression
changes were attributable to underlying gene copy number alterations 
in breast cancer. 

The overall impact of copy number on gene expression patterns was 
substantial with the most dramatic effects seen in the case of high- 



• Internet address: http://www.geneontology.org/.



' Internet address: http://www.ncbi.nlm.nih.gov/entrez.
level copy number increase. Low-level copy number gains and losses
also had a significant influence on expression levels of genes in the 
regions affected, but these effects were more subtle on a gene-by-gene 
basis than those of high-level amplifications. However, the impact of 
low-level gains on the dysregulation of gene expression patterns in 
cancer may be equally important if not more important than that of 
high-level amplifications. Aneuploidy and low-level gains and losses
of chromosomal arms represent the most common types of genetic 
alterations in breast and other cancers and, therefore, have an influ- 
ence on many genes. Our results in breast cancer extend the recent 
studies on the impact of aneuploidy on global gene expression pat- 
terns in yeast cells, acute myeloid leukemia, and a prostate cancer 

The CGH microarray analysis identified 24 independent breast
cancer amplicons. We defined the precise boundaries for many am-
plicons detected previously by chromosomal CGH (9, 10, 25, 26) and
also discovered novel amplicons that had not been detected previ-
ously, presumably because of their small size (only 1-2 Mb) or close
proximity to other larger amplicons. One of these novel amplicons
involved the homeobox gene region at 17q21.3 and led to the over-
expression of the HOXB7 and HOXB2 genes. The homeodomain
transcription factors are known to be key regulators of embryonic
development and have been occasionally reported to undergo aberrant
expression in cancer (27, 28). HOXB7 transfection induced cell pro-
liferation in melanoma, breast, and ovarian cancer cells and increased
tumorigenicity and angiogenesis in breast cancer (29-32). The pres-
ent results imply that gene amplification may be a prominent mech-
anism for overexpressing HOXB7 in breast cancer and suggest that
HOXB7 contributes to tumor progression and confers an aggressive 
disease phenotype in breast cancer. This view is supported by our 
finding of amplification of HOXB7 in 10% of 363 primary breast 
cancers, as well as an association of amplification with poor prognosis 
of the patients. 

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to 
amplification status. Statistical analysis revealed 270 such genes 
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression.
Most of the 270 genes have not been implicated previously in
breast cancer development and suggest novel pathogenetic mech-
anisms. Although we would not expect all of them to be causally
involved, it is intriguing that 84% of the genes with associated
functional information were implicated in apoptosis, cell prolifer-
ation, signal transduction, transcription, or other cellular processes
that could directly imply a possible role in cancer progression.
Therefore, a detailed characterization of these genes may provide
biological insights to breast cancer progression and might lead to
the development of novel therapeutic strategies. 

In summary, we demonstrate application of cDNA microarrays
to the analysis of both copy number and expression levels of over
12,000 transcripts throughout the breast cancer genome, roughly
once every 267 kb. This analysis provided: (a) evidence of a
prominent global influence of copy number changes on gene
expression levels; (b) a high-resolution map of 24 independent
amplicons in breast cancer; and (c) identification of a set of 270
genes, the overexpression of which was statistically attributable to
gene amplification. Characterization of a novel amplicon at
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association
between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by 
gene amplification provides a powerful approach to highlight 
genes with an important role in cancer as well as to prioritize and 
validate putative targets for therapy development. 
Genomic DNA copy number alterations are key genetic events in
the development and progression of human cancers. Here we
report a genome-wide microarray comparative genomic hybrid-
ization (array CGH) analysis of DNA copy number variation in
a series of primary human breast tumors. We have profiled DNA
copy number alteration across 6,691 mapped human genes, in 44
predominantly advanced, primary breast tumors and 10 breast
cancer cell lines. While the overall patterns of DNA amplification
and deletion corroborate previous cytogenetic studies, the high-
resolution (gene-by-gene) mapping of amplicon boundaries and
the quantitative analysis of amplicon shape provide significant
improvement in the localization of candidate oncogenes. Parallel
microarray measurements of mRNA levels reveal the remarkable
degree to which variation in gene copy number contributes to
variation in gene expression in tumor cells. Specifically, we find
that 62% of highly amplified genes show moderately or highly
elevated expression, that DNA copy number influences gene ex-
pression across a wide range of DNA copy number alterations
(deletion, low-, mid- and high-level amplification), that on average,
a 2-fold change in DNA copy number is associated with a corre-
sponding 1.5-fold change in mRNA levels, and that overall, at least
12% of all the variation in gene expression among the breast
tumors is directly attributable to underlying variation in gene copy
number. These findings provide evidence that widespread DNA 
copy number alteration can lead directly to global deregulation of 
gene expression, which may contribute to the development or 
progression of cancer. 

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identifi- 
cation of a number of recurrent regions of DNA copy number 
alteration in breast cancer cell lines and tumors (2-4). While
some of these regions contain known or candidate oncogenes
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide 
map, delineating the boundaries of DNA copy number alter- 
ations in tumors, should facilitate the localization and identifi- 
cation of oncogenes and tumor suppressor genes in breast 
cancer. In this study, we have created such a map, using 
array-based CGH (5-7) to profile DNA copy number alteration 
in a series of breast cancer cell lines and primary tumors. 

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified 
in breast tumors alter expression of genes within involved 
regions. Because we had measured mRNA levels in parallel in 
the same samples (8), using the same DNA microarrays, we had 
an opportunity to explore on a genomic scale the relationship 
between DNA copy number changes and gene expression. From 



this analysis, we have identified a significant impact of wide- 
spread DNA copy number alteration on the transcriptional 
programs of breast tumors. 

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA label- 
ing and hybridizations were performed essentially as described 
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters and the
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691
different mapped human genes (i.e., UniGene clusters). The
"reference" (labeled with Cy3) for each hybridization was nor-
mal female leukocyte DNA from a single donor. The fabrication
of cDNA microarrays and the labeling and hybridization of 
mRNA samples have been described (8). 

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and 
fluorescence ratios (test/reference) calculated using scanalyze 
software (available at http://rana.lbl.gov). Fluorescence ratios 
were normalized for each array by setting the average log 
fluorescence ratio for all array elements equal to 0. Measure- 
ments with fluorescence intensities more than 20% above back- 
ground were considered reliable. DNA copy number profiles 
that deviated significantly from background ratios measured in 
normal genomic DNA control hybridizations were interpreted as 
evidence of real DNA copy number alteration (see Estimating 
Significance of Altered Fluorescence Ratios in the supporting 
information). When indicated, DNA copy number profiles are 
displayed as a moving average (symmetric 5-nearest neighbors).
Map positions for arrayed human cDNAs were assigned by
identifying the starting position of the best and longest match of
any DNA sequence represented in the corresponding UniGene
cluster (10) against the "Golden Path" genome assembly
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene
clusters represented by multiple arrayed elements, mean
fluorescence ratios (for all elements representing the same UniGene
cluster) are reported. For mRNA measurements, fluorescence
ratios are "mean-centered" (i.e., reported relative to the mean
ratio across the 44 tumor samples). The data set described here
can be accessed in its entirety in the supporting information.

Fig. 1. Genome-wide measurement of DNA copy number alteration by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each column represents
one of 6,691 different mapped human genes present on the microarray, ordered by genome map position from 1pter through Xqter. Moving average (symmetric
5-nearest neighbors) fluorescence ratios (test/reference) are depicted using a log2-based pseudocolor scale (indicated), such that red luminescence reflects
fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.
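
The per-array normalization and the moving-average display described in
Data Analysis can be sketched as follows. This is only an illustration
under stated assumptions: the function names are hypothetical, base-2
logarithms are assumed, and "symmetric 5-nearest neighbors" is read here
as a 5-element window centered on each gene (shrinking at the ends of
the ordered list); the text does not specify these details.

```python
import math

def normalize_log_ratios(ratios):
    """Return log2(test/reference) ratios shifted so their mean is 0.

    Assumption: base-2 logs; the text only says the average log ratio
    for all array elements is set equal to 0.
    """
    logs = [math.log2(r) for r in ratios]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def moving_average(values, window=5):
    """Symmetric moving average over genes ordered by map position.

    Assumption: the window is centered on each element and is simply
    truncated at either end of the list.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical raw ratios for a handful of array elements.
raw = [1.0, 2.0, 4.0]
centered = normalize_log_ratios(raw)
profile = moving_average([0.0, 0.0, 5.0, 0.0, 0.0])
```

Smoothing trades spatial resolution for noise reduction, which is
presumably why the profiles are displayed this way only "when indicated".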

Results 

We performed CGH on 44 predominantly locally advanced,
primary breast tumors and 10 breast cancer cell lines, using
cDNA microarrays containing 6,691 different mapped human
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the
improved spatial resolution of array CGH, we ordered
(fluorescence ratios for) the 6,691 cDNAs according to the
"Golden Path" (http://genome.ucsc.edu/) genome assembly of the
draft human genome sequences (11). In so doing, arrayed cDNAs
not only themselves represent genes of potential interest (e.g.,
candidate oncogenes within amplicons), but also provide precise
genetic landmarks for chromosomal regions of amplification and
deletion. Parallel analysis of DNA from cell lines containing
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect
single-copy loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence
ratios were linearly proportional to copy number ratios, which
were slightly underestimated, in agreement with previous
observations (7). Numerous DNA copy number alterations were
evident in both the breast cancer cell lines and primary tumors
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes
were generally lower in the tumor samples. DNA copy number
alterations were found in every cancer cell line and tumor, and
on every human chromosome in at least one sample. Recurrent
regions of DNA copy number gain and loss were readily
identifiable. For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing
of gains/losses is provided in Tables 2 and 3, which are published
as supporting information on the PNAS web site). The total
number of genomic alterations (gains and losses) was found to
be significantly higher in breast tumors that were high grade
(P = 0.008), consistent with published CGH data (3), estrogen
receptor negative (P = 0.04), and harboring TP53 mutations
(P = 0.0006) (see Table 4, which is published as supporting
information on the PNAS web site).

Fig. 2. DNA copy number alteration across chromosome 8 by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different numbers
of X chromosomes, for breast cancer cell lines, and for breast tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering to
highlight recurrent copy number changes. The 241 genes present on the microarrays and mapping to chromosome 8 are ordered by position along the
chromosome. Fluorescence ratios (test/reference) are depicted by a log2 pseudocolor scale (indicated). Selected genes are indicated with color-coded text (red,
increased; green, decreased; black, no change; gray, not well measured) to reflect correspondingly altered mRNA levels (observed in the majority of the subset
of samples displaying the DNA copy number change). The map positions for genes of interest that are not represented on the microarray are indicated in the
row above those genes represented on the array. (b) Graphical display of DNA copy number profile for breast cancer cell line SKBR3. Fluorescence ratios
(tumor/normal) are plotted on a log2 scale for chromosome 8 genes, ordered along the chromosome.

The improved spatial resolution of our array CGH analysis is
illustrated for chromosome 8, which displayed extensive DNA
copy number alteration in our series. A detailed view of the
variation in the copy number of 241 genes mapping to
chromosome 8 revealed multiple regions of recurrent amplification;
each of these potentially harbors a different known or previously
uncharacterized oncogene (Fig. 2a). The complexity of amplicon
structure is most easily appreciated in the breast cancer cell line
SKBR3. Although a conventional CGH analysis of 8q in SKBR3
identified only two distinct regions of amplification (12), we
observed three distinct regions of high-level amplification
(labeled 1-3 in Fig. 2b). For each of these regions we can define the
boundaries of the interval recurrently amplified in the tumors we
examined; in each case, known or plausible candidate oncogenes
can be identified (a description of these regions, as well as the
recurrently amplified regions on chromosomes 17 and 20, can be
found in Figs. 6 and 7, which are published as supporting
information on the PNAS web site).

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is 
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

A strong influence of DNA copy number on gene expression
is evident in an examination of the pseudocolor representations
of DNA copy number and mRNA levels for genes on
chromosome 17 (Fig. 3). The overall patterns of gene amplification and
elevated gene expression are quite concordant; i.e., a significant
fraction of highly amplified genes appear to be correspondingly
highly expressed. The concordance between high-level
amplification and increased gene expression is not restricted to
chromosome 17. Genome-wide, of 117 high-level DNA
amplifications (fluorescence ratios >4, and representing 91 different
genes), 62% (representing 54 different genes; see Table 5, which
is published as supporting information on the PNAS web site)
are found associated with at least moderately elevated mRNA
levels (mean-centered fluorescence ratios >2), and 42%
(representing 36 different genes) are found associated with
comparably highly elevated mRNA levels (mean-centered fluorescence
ratios >4).

Fig. 3. Concordance between DNA copy number and gene expression across chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering (Upper), and the
identical sample order is maintained (Lower). The 354 genes present on the microarrays and mapping to chromosome 17, and for which both DNA copy number
and mRNA levels were determined, are ordered by position along the chromosome; selected genes are indicated in color-coded text (see Fig. 2 legend).
Fluorescence ratios (test/reference) are depicted by separate log2 pseudocolor scales (indicated).
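
The genome-wide concordance figures quoted above amount to a simple
count over paired measurements. The sketch below shows one way such a
tally could be made; the function name and the toy data are
hypothetical, and only the two cutoffs (DNA ratio >4, mean-centered
mRNA ratio >2 or >4) come from the text.

```python
def concordance(pairs, dna_cutoff=4.0, rna_cutoff=2.0):
    """Fraction of high-level amplifications with elevated mRNA.

    pairs: iterable of (dna_ratio, mean_centered_rna_ratio), one per
    gene/sample measurement. Returns None if nothing exceeds the DNA
    cutoff.
    """
    amplified = [(d, r) for d, r in pairs if d > dna_cutoff]
    if not amplified:
        return None
    elevated = sum(1 for _, r in amplified if r > rna_cutoff)
    return elevated / len(amplified)

# Hypothetical measurements, not data from the study.
toy = [(5.0, 3.0), (6.0, 1.0), (2.0, 8.0), (4.5, 2.5)]
moderately_elevated = concordance(toy, rna_cutoff=2.0)
highly_elevated = concordance(toy, rna_cutoff=4.0)
```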

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the
breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a
statistically significant fashion (P values for pair-wise Student's
t tests comparing adjacent classes: cell lines, 4 × 10^-49,
1 × 10^[illegible], 5 × 10^[illegible], 1 × 10^[illegible]; tumors,
1 × 10^[illegible], 1 × 10^[illegible], 5 × 10^[illegible],
1 × 10^-4). A linear regression of the average log(DNA copy
number), for each class, against average log(mRNA level)
demonstrated that on average, a 2-fold change in DNA copy
number was accompanied by 1.4- and 1.5-fold changes in mRNA
level for the breast cancer cell lines and tumors, respectively (Fig.
4a, regression line not shown). Second, we characterized the
distribution of the 6,095 correlations between DNA copy num- 
ber and mRNA level, each across the 37 tumor samples (Fig. 4b). 
The distribution of correlations forms a normal-shaped curve, 
but with the peak markedly shifted in the positive direction from 
zero. This shift is statistically significant, as evidenced in a plot 
of observed vs. expected correlations (Fig. 4c), and reflects a 
pervasive global influence of DNA copy number alterations on 
gene expression. Notably, the highest correlations between DNA 
copy number and mRNA level (the right tail of the distribution 
in Fig. 4b) comprise both amplified and deleted genes (data not 
shown). Third, we used a linear regression model to estimate the 
fraction of all variation measured in mRNA levels among the 37
tumors that could be attributed to underlying variation in DNA
copy number. From this analysis, we estimate that approximately
7% of all of the observed variation in mRNA levels can be
explained directly by variation in copy number of the altered
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction
of the data most reliably measured (fluorescence intensity/
background >3); using that data, our estimate of the percent
variation in mRNA levels directly attributed to variation in gene
copy number increases to 12% (Fig. 4d). This still undoubtedly
represents a significant underestimate, as the observed variation
in global gene expression is affected not only by true variation in
the expression programs of the tumor cells themselves, but also
by the variable presence of non-tumor cell types within clinical
samples.
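
The three analyses above rest on standard quantities: a log-log
regression slope, a per-gene correlation across samples, and a fraction
of variance explained. The sketch below is a minimal illustration, not
the authors' model (whose details are not given here). It assumes
base-2 logs and mRNA level as the response variable; all names and the
toy values are hypothetical. On a log-log scale a slope b means that a
2-fold change in copy number accompanies a 2^b-fold change in mRNA, so
the reported 1.5-fold change corresponds to b = log2(1.5), about 0.58.
For a simple linear regression the fraction of variance explained
equals the squared Pearson correlation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def fold_change_per_doubling(log_log_slope):
    """mRNA fold change accompanying a 2-fold copy number change."""
    return 2.0 ** log_log_slope

# Hypothetical log2 ratios for one gene across five samples.
log2_dna = [-1.0, 0.0, 0.5, 1.0, 2.0]
log2_rna = [-0.5, 0.0, 0.25, 0.5, 1.0]

r = pearson(log2_dna, log2_rna)        # one of the per-gene correlations
variance_explained = r ** 2            # R^2 of the simple regression
b = slope(log2_dna, log2_rna)
fold = fold_change_per_doubling(b)
```

Repeating the correlation for every gene gives the distribution whose
shift away from zero is described in the text.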

Discussion 

This genome-wide, array CGH analysis of DNA copy number
alteration in a series of human breast tumors demonstrates the
usefulness of defining amplicon boundaries at high resolution
(gene-by-gene), and quantitatively measuring amplicon shape, to
assist in locating and identifying candidate oncogenes. By
analyzing mRNA levels in parallel, we have also discovered that
changes in DNA copy number have a large, pervasive, direct
effect on global gene expression patterns in both breast cancer
cell lines and tumors. Although the DNA microarrays used in our
analysis may display a bias toward characterized and/or highly
expressed genes, because we are examining such a large fraction
of the genome (approximately 20% of all human genes), and
because, as detailed above, we are likely underestimating the
contribution of DNA copy number changes to altered gene
expression, we believe our findings are likely to be generalizable
(but would nevertheless still be remarkable if only applicable to
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In 
agreement with our findings, Phillips et al. (14) have shown that 
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from
our finding that 62% of highly amplified genes in breast cancer
exhibit at least 2-fold increased expression. These contrasting
findings may reflect methodological differences between the [...]

[...] the genomic distribution of expressed genes, even within existing
microarray gene expression data sets, may permit the inference 
of DNA copy number aberration, particularly aneuploidy (where 
gene expression can be averaged across large chromosomal 
regions; see Fig. 3 and supporting information). Fifth, this 
finding implies that a substantial portion of the phenotypic 
uniqueness (and by extension, the heterogeneity in clinical 
behavior) among patients' tumors may be traceable to underly- 
ing variation in DNA copy number. Sixth, this finding supports 
a possible role for widespread DNA copy number alteration in 
tumorigenesis (17, 18), beyond the amplification of specific
oncogenes and deletion of specific tumor suppressor genes.
Widespread DNA copy number alteration, and the concomitant
widespread imbalance in gene expression, might disrupt critical
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor 
development or progression. Finally, our findings suggest the 
possibility of cancer therapies that exploit specific or global 
imbalances in gene expression in cancer. 

Each year, over 182,000 women in the United States are
diagnosed with breast cancer, and approximately 45,000 die 
of the disease. 1 Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role. 2 

Five-year survival rates range from approximately 65%- 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adju- 
vant therapy, which increases their survival. However, 20%- 
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expres- 
sion of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in
40%-60% of intraductal breast carcinoma. 3 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridiza- 
tion, PathVysion™ Kit). Both methods can be performed on 
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantifi- 
cation of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplification.
At least one study has demonstrated a difference in
recurrence risk in women younger than 40 years of age for
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification. 4 HER-2/neu 
status may be particularly important to establish in women with 
small (< 1 cm) tumor size. 

The choice of methodology for determination of HER-2/ 
neu status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of adriamycin- 
based therapy, while those with normal HER-2/neu levels did 
not. The study therefore identified a sub-set of women, who 
because they did not benefit from more aggressive treatment, 
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death. 5 Demonstration of HER- 
2/neu gene amplification by FISH has also been shown to be 
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (Trastuzumab)
monoclonal antibody therapy, however, is based upon demonstration
of HER-2/neu protein overexpression using HercepTest™.
Studies using Herceptin® in patients with metastatic breast
cancer show an increase in time to disease progression,
increased response rate to chemotherapeutic agents and a small
increase in overall survival rate. The FISH assays have not yet
been approved for this purpose, and studies looking at response
to Herceptin® in patients with or without gene amplification
status determined by FISH are in progress.

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clini- 
cal significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will utilize
immunohistochemistry (HercepTest®) as a screen, followed
by FISH in IHC-negative cases. Alternatively, either
method may be ordered individually depending on the clini- 
cal setting or clinician preference. 

CPT code information

HER-2/neu via IHC

88342 (including interpretive report)

HER-2/neu via FISH

88271 x2 Molecular cytogenetics, DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization,
analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation
and report

Procedural Information 

Immunohistochemistry is performed using the FDA-approved
DAKO antibody kit, HercepTest®. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary
rabbit antibody to human HER-2/neu protein, the kit employs
a ready-to-use dextran-based visualization reagent. This
reagent consists of both secondary goat anti-rabbit antibody
molecules and horseradish peroxidase molecules linked to a
common dextran polymer backbone, thus eliminating the need
for sequential application of link antibody and
peroxidase-conjugated antibody. Enzymatic conversion of the
subsequently added chromogen results in formation of visible
reaction product at the antigen site. The specimen is then
counterstained; a pathologist interprets the results using light
microscopy.

FISH analysis at SHMC/PAML is performed using the
FDA-approved PathVysion™ HER-2/neu DNA probe kit,
produced by Vysis, Inc. Formalin-fixed, paraffin-embedded breast
tissue is processed using routine histological methods, and then
slides are treated to allow hybridization of DNA probes to the
nuclei present in the tissue section. The PathVysion™ kit
contains two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, spectrum orange) present at
the chromosome 17 centromere and the second for the
HER-2/neu oncogene located at 17q11.2-12 (spectrum green).
Enumeration of the probes allows a ratio of the number of copies
of HER-2/neu to the number of copies of chromosome 17 to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or an increase in the number of
chromosome 17s in the cells. In the majority of cases, ratios
less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of this data will be
performed and reported from the Vysis-certified Cytogenetics
laboratory at SHMC.
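
The ratio-based interpretation described above can be expressed as a
short sketch. It is illustrative only and not a clinical tool: the
function names are hypothetical, the ratio is assumed to be total
HER-2/neu signals divided by total chromosome 17 centromere (CEP 17)
signals over the scored nuclei, and the label used for ratios between
2.0 and 2.1 is an assumption, because the text does not say how that
interval is reported.

```python
def her2_cep17_ratio(her2_signals, cep17_signals):
    """Ratio of HER-2/neu signals to chromosome 17 centromere signals."""
    if cep17_signals <= 0:
        raise ValueError("CEP 17 signal count must be positive")
    return her2_signals / cep17_signals

def interpret_ratio(ratio):
    """Apply the cutoffs stated in the text (<2.0 and >=2.1)."""
    if ratio < 2.0:
        return "not amplified"
    if ratio >= 2.1:
        return "amplified"
    # Assumption: the text leaves ratios from 2.0 up to 2.1 unspecified.
    return "indeterminate"

# Hypothetical counts over a set of scored nuclei.
polysomy = her2_cep17_ratio(her2_signals=160, cep17_signals=160)
amplified = her2_cep17_ratio(her2_signals=240, cep17_signals=80)
```

Dividing by the centromere count is what separates true gene
amplification from extra copies of chromosome 17: in the polysomy
example both counts rise together and the ratio stays at 1.0.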

ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signaling
pathway have been linked to tumorigenesis in familial and
sporadic colon carcinomas. Here we report the identification
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-
20q13 and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.

Wnt-1 is a member of an expanding family of cysteine-rich,
glycosylated signaling proteins that mediate diverse
developmental processes such as the control of cell proliferation,
adhesion, cell polarity, and the establishment of cell fates (1,
2). Wnt-1 originally was identified as an oncogene activated by
the insertion of mouse mammary tumor virus in virus-induced
mammary adenocarcinomas (3, 4). Although Wnt-1 is not
expressed in the normal mammary gland, expression of Wnt-1
in transgenic mice causes mammary tumors (5).

In mammalian cells, Wnt family members initiate signaling
by binding to the seven-transmembrane spanning Frizzled
receptors and recruiting the cytoplasmic protein Dishevelled
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the
kinase activity of the normally constitutively active glycogen
synthase kinase-3β (GSK-3β) resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the
transcription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations
contribute to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

Although much has been learned about the Wnt signaling
pathway over the past several years, only a few of the
transcriptionally activated downstream components activated by
Wnt have been characterized. Those that have been described
cannot account for all of the diverse functions attributed to
Wnt signaling. Among the candidate Wnt target genes are
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the
Wnt signaling pathway (10).

To identify additional downstream genes in the Wnt signaling
pathway that are relevant to the transformed cell phenotype,
we used a PCR-based cDNA subtraction strategy,
suppression subtractive hybridization (SSH) (11), using RNA
isolated from C57MG mouse mammary epithelial cells and
C57MG cells stably transformed by a Wnt-1 retrovirus.
Overexpression of Wnt-1 in this cell line is sufficient to induce a
partially transformed phenotype, characterized by elongated
and refractile cells that lose contact inhibition and form a
multilayered array (12, 13). We reasoned that genes
differentially expressed between these two cell lines might contribute
to the transformed phenotype.

In this paper, we describe the cloning and characterization
of two genes up-regulated in Wnt-1 transformed cells, WISP-1
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which
includes connective tissue growth factor (CTGF), Cyr61, and
nov, a family not previously linked to Wnt signaling.

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 μg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 μg
of poly(A)+ RNA from the parent C57MG cells. The
subtracted cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
To whom reprint requests should be addressed. e-mail: diane@gene.
com.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 7[illegible]-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones
encoding full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-[illegible].
Full-length cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 μM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles.
WISP and glyceraldehyde-3-phosphate dehydrogenase primer
sequences are available on request.

In Situ Hybridization. 33P-labeled sense and antisense
riboprobes were transcribed from an [illegible]-bp PCR product
corresponding to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides [illegible]-371 of
mouse WISP-2. All tissues were processed as described ([illegible]).

Radiation Hybrid Mapping. Genomic DNA from each
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid
Panels (Research Genetics, Huntsville, AL) and human and
hamster control DNAs were PCR-amplified, and the results
were submitted to the Stanford or Massachusetts Institute of
Technology web servers.

Cell Lines, Tumors, and Mucosa Specimens. Tissue specimens
were obtained from the Department of Pathology (University
of Pittsburgh) for patients undergoing colon resection
and from the University of Leeds, United Kingdom. Genomic
DNA was isolated (Qiagen) from the pooled blood of 10
normal human donors, surgical specimens, and the following
ATCC human cell lines: SW480, COLO 320DM, HT-29,
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph
node metastasis, colon adenocarcinoma), HCT 116 (colon
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and
HM7 (a variant of ATCC colon adenocarcinoma cell line LS
174T). DNA concentration was determined by using Hoechst
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation
over CsCl cushions or prepared by using RNAzol.

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
∂-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate
dehydrogenase housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems.
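
The 2^(ΔCt) calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names, and the way the housekeeping-gene normalization is applied (subtracting the GAPDH ΔCt in the exponent) are assumptions, since the text states only that the signal was normalized to glyceraldehyde-3-phosphate dehydrogenase and does not give the exact procedure.

```python
def relative_level(ct_target_ref, ct_target_sample,
                   ct_gapdh_ref, ct_gapdh_sample):
    """Relative copy number or expression level as 2 ** (delta Ct).

    delta Ct is the number of extra cycles the reference material
    (e.g. pooled normal DNA) needs to reach threshold compared with
    the sample (e.g. tumor DNA). The GAPDH term is a hypothetical
    normalization step; the source does not spell out how it was done.
    """
    delta_ct_target = ct_target_ref - ct_target_sample
    delta_ct_gapdh = ct_gapdh_ref - ct_gapdh_sample
    return 2.0 ** (delta_ct_target - delta_ct_gapdh)


if __name__ == "__main__":
    # Made-up cycle values: the target is detected one cycle earlier
    # in the sample, while GAPDH is unchanged.
    print(relative_level(25.0, 24.0, 20.0, 20.0))
```

With these made-up values the sample is detected one cycle earlier than the reference, which corresponds to a 2-fold relative level; a difference of two cycles would correspond to 4-fold.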

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify
Wnt-1-inducible genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially
expressed cDNAs (1,384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homologues,
32% matched expressed sequence tags, and 29% had
no match. To confirm that the transcript was differentially
expressed, semiquantitative reverse transcription-PCR and
Northern analysis were performed by using mRNA from the
C57MG and C57MG/Wnt-1 cells.

Two of the cDNAs, WISP-1 and WISP-2, were differentially
expressed, being induced in the C57MG/Wnt-1 cell line, but
not in the parent C57MG cells or C57MG cells overexpressing
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell
line and WISP-2 by approximately 5-fold by both Northern
analysis and reverse transcription-PCR.

An independent, but similar, system was used to examine
WISP expression after Wnt-1 induction. C57MG cells expressing
the Wnt-1 gene under the control of a tetracycline-repressible
promoter produce low amounts of Wnt-1 in the
repressed state but show a strong induction of Wnt-1 mRNA
and protein within 24 hr after tetracycline removal (8). The
levels of Wnt-1 and WISP RNA isolated from these cells at
various times after tetracycline removal were assessed by
quantitative PCR. Strong induction of Wnt-1 mRNA was seen
as early as 10 hr after tetracycline removal. Induction of WISP
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown).
These data support our previous observations that show that
WISP induction is correlated with Wnt-1 expression. Because
the induction is slow, occurring after approximately 48 hr, the
induction of WISPs may be an indirect response to Wnt-1
signaling.

cDNA clones of human WISP-1 were isolated and the
sequence compared with mouse WISP-1. The cDNA sequences
of mouse and human WISP-1 were 1,766 and 2,830 bp in length,
respectively, and encode proteins of 367 aa, with predicted
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,233 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative molecular
masses of ≈27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at




FIG. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/Wnt-4
cells. Poly(A)+ RNA (2 μg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

FIG. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-44%)
was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that
associate with the cell surface and extracellular matrix.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).

FIG. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic
representation of the WISP proteins showing the domain structure and
cysteine residues (vertical lines). The four cysteine residues in the VWC
domain that are absent in WISP-3 are indicated with a dot. (C)
Expression of WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.
In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expression
of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-level
WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

FIG. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromosome Localization of the WISP Genes. The chromosomal
location of the human WISP genes was determined
by radiation hybrid mapping panels. WISP-1 is approximately

us cR from.me meiouc marker aPM259xc5 [logarithni of 
crtns (lod) score i M i ] on chromosome 8q24.l to 8q24.3. In tue 
same region as the human locus of the novH family member
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine
mapping indicates that WISP-1 is located near D8S1712 STS.

WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to chromosome
6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene
MYB (27, 29).

Amplification and Aberrant Expression of WISPs in Human
Colon Tumors. Amplification of protooncogenes is seen in
many human tumors and has etiological and prognostic
significance. For example, in a variety of tumor types, c-myc
amplification has been associated with malignant progression
and poor prognosis (30). Because WISP-1 resides in the same
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so,
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold)
amplification, with the HT-29 and WiDr cell lines demonstrating
an 8-fold increase. Significantly, the pattern of amplification
observed did not correlate with that observed for c-myc,
indicating that the c-myc gene is not part of the amplicon that
involves the WISP-1 locus.

We next examined whether the WISP genes were amplified
in a panel of 25 primary human colon adenocarcinomas. The
relative WISP gene copy number in each colon tumor DNA
was compared with pooled normal DNA from 10 donors by
quantitative PCR (Fig. 6). The copy number of WISP-1 and
WISP-2 was significantly greater than one, approximately
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was significantly
higher than that of WISP-1 (P < 0.001).

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.

FIG. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 166-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

FIG. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes was
determined by comparing DNA from primary human tumors with
pooled DNA from 10 donors. The data are means ± SEM from one
experiment done in triplicate. The experiment was repeated at least
two times.

FIG. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assessed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least
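
The percentages quoted for the 19 matched tumor and mucosa pairs follow directly from the counts given in parentheses. A minimal Python sketch of that arithmetic is shown below; the helper name and the dictionary labels are illustrative choices, and only the counts (16, 15, and 12 of 19) come from the text.

```python
def percent(hits, total):
    """Return hits/total as a whole-number percentage."""
    return round(100 * hits / total)


# Counts taken from the text; the labels are descriptive only.
tumor_counts = {
    "WISP-1 RNA increased": (16, 19),
    "WISP-2 RNA decreased": (15, 19),
    "WISP-3 RNA increased": (12, 19),
}

if __name__ == "__main__":
    for label, (hits, total) in tumor_counts.items():
        print(f"{label}: {hits}/{total} = {percent(hits, total)}%")
```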

DISCUSSION 

One approach to understanding the molecular basis of cancer
is to identify differences in gene expression between cancer
cells and normal cells. Strategies based on assumptions that
steady-state mRNA levels will differ between normal and
malignant cells have been used to clone differentially expressed
genes (31). We have used a PCR-based selection
strategy, SSH, to identify genes selectively expressed in
C57MG mouse mammary epithelial cells transformed by
Wnt-1. Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which
includes CTGF, Cyr61, and nov, a family not previously linked
to Wnt signaling.

Two independent experimental systems demonstrated that
WISP induction was associated with the expression of Wnt-1.
The first was C57MG cells infected with a Wnt-1 retroviral
vector or C57MG cells expressing Wnt-1 under the control of
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1,
whereas normal breast tissue does not. No WISP RNA expression
was detected in mammary tumors induced by polyoma
virus middle T antigen (data not shown). These data suggest
a link between Wnt-1 and WISPs in that in these two situations,
WISP induction was correlated with Wnt-1 expression.

It is not clear whether the WISPs are directly or indirectly
induced by the downstream components of the Wnt-1 signaling
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1
signaling turning on a transcription factor, which in turn
The WISPs define an additional subfamily of the CCN family
of growth factors. One striking difference observed in the
protein sequence of WISP-2 is the absence of a CT domain,
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3.
This domain is thought to be involved in receptor binding and
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine
knot motif exist as dimers (31). It is tempting to speculate that
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2
exists as a monomer. If the CT domain is also important for
receptor binding, WISP-2 may bind its receptor through a
different region of the molecule than the other CCN family
members. No specific receptors have been identified for CTGF
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33).

The strong expression of WISP-1 and WISP-2 in cells lying
within the fibrovascular tumor stroma in breast tumors from
Wnt-1 transgenic animals is consistent with previous observations
that transcripts for the related CTGF gene are primarily
expressed in the fibrous stroma of mammary tumors
(34). Epithelial cells are thought to control the proliferation of
connective tissue stroma in mammary tumors by a cascade of
growth factor signals similar to that controlling connective
tissue formation during wound repair. It has been proposed
that mammary tumor cells or inflammatory cells at the tumor
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the
growth factors that stimulates the production of CTGF and
WISPs in the stroma.

It was of interest that WISP-1 and WISP-2 expression was
observed in the stromal cells that surrounded the tumor cells
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracellular
matrix. Stromal cell-derived factors in the extracellular
matrix have been postulated to play a role in tumor cell
migration and proliferation (35). The localization of WISP-1
and WISP-2 in the stromal cells of breast tumors supports this
paracrine model.

An analysis of WISP-1 gene amplification and expression in
human colon tumors showed a correlation between DNA
amplification and overexpression, whereas overexpression of
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors
but its mRNA expression was significantly reduced in the
majority of tumors compared with the expression in normal
colonic mucosa from the same patient. The gene for human
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node negative breast cancer and many colon cancers, suggesting
the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification
observed for WISP-2 may be caused by another gene in this
amplicon.

A recent manuscript on rCop-1, the rat orthologue of
WISP-2, describes the loss of expression of this gene after cell
transformation, suggesting it may be a negative regulator of
growth in cell lines (16). Although the mechanism by which
WISP-2 RNA expression is down-regulated during malignant
transformation is unknown, the reduced expression of WISP-2
in colon tumors and cell lines suggests that it may function as
a tumor suppressor. These results show that the WISP genes
are aberrantly expressed in colon cancer and suggest that their
altered expression may confer selective growth advantage to
the tumor.

Members of the Wnt signaling pathway have been implicated
in the pathogenesis of colon cancer, breast cancer, and
melanoma, including the tumor suppressor gene adenomatous
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to human
carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1
transforms cells and induces tumorigenesis is unknown, the
identification of WISPs as genes that may be regulated downstream
of Wnt-1 in C57MG cells suggests they could be
important mediators of Wnt-1 transformation. The amplification
and altered expression patterns of the WISPs in human
colon tumors may indicate an important role for these genes
in tumor development.

We thank the DNA synthesis group for oligonucleotide synthesis, T.
Baker for technical assistance, P. Dowd for radiation hybrid mapping,
K. Willert and R. Nusse for the tet-repressible C57MG/Wnt-1 cells, V.
Dixit for discussions, and D. Wood and A. Bruce for artwork.

Variable expression of the translocated c-abl oncogene in
Philadelphia-chromosome-positive B-lymphoid cell lines
from chronic myelogenous leukemia patients

James B. Konopka*, Steven Clark*, Jami McLaughlin*, Masakazu Nitta†, Yoshiro Kato†,
Annabel Strife†, Bayard Clarkson†, and Owen N. Witte*‡

Avenue, New York, NY 10021

Communicated by Michael Potter, February 10, 1986

ABSTRACT The consistent cytogenetic translocation of
chronic myelogenous leukemia (the Philadelphia chromosome,
Ph1) has been observed in cells of multiple hematopoietic
lineages. This translocation creates a chimeric gene composed
of breakpoint-cluster-region (bcr) sequences from chromosome
22 fused to a portion of the abl oncogene on chromosome 9. The
resulting gene product (P210c-abl) resembles the transforming
protein of the Abelson murine leukemia virus in its structure
and tyrosine kinase activity. P210c-abl is expressed in Ph1-positive
cell lines of myeloid lineage and in clinical specimens
with myeloid predominance. We show here that Epstein-Barr
virus-transformed B-lymphocyte lines that retain Ph1 can
express P210c-abl. The level of expression in these B-cell lines is
generally lower and more variable than that observed for
myeloid lines. Protein expression is not related to amplification
of the abl gene but to variation in the level of bcr-abl mRNA
produced from a single Ph1 template.

Chronic myelogenous leukemia (CML) is a disease of the
pluripotent stem cell (1). In greater than 95% of patients, the
leukemic cells contain the cytogenetic marker known as the
Philadelphia chromosome, or Ph1 (2). This reciprocal
translocation event between the long arms of chromosomes
9 and 22 has been used as a disease-specific marker for
diagnosis and evaluation of therapy. Multiple hematopoietic
lineages, including myeloid and B-lymphoid, contain Ph1 in
early or chronic phase, as well as in the more acute accelerated
and blast crisis phases of the disease.

One molecular consequence of Ph1 is the translocation of
the chromosomal arm containing the c-abl gene on chromosome
9 into the middle of the breakpoint-cluster region (bcr)
gene on chromosome 22 (3-6). Although the precise
translocation breakpoints are variable, an RNA-splicing
mechanism generates a very similar 8-kilobase (kb) mRNA in
each case (5-9). The hybrid bcr-abl message encodes a
structurally altered form of the abl oncogene product, called
P210c-abl (10-13), with an amino-terminal segment derived
from a portion of the exons of bcr on chromosome 22 and a
carboxyl-terminal segment derived from a major portion of
the exons of the c-abl gene on chromosome 9. The chimeric
structure of bcr-abl and the resulting P210c-abl is similar to the
structure of the Abelson murine leukemia virus gag-abl
genome and resulting P160v-abl transforming gene product.
Both proteins have very similar tyrosine kinase activities (10,
11, 14) which can be distinguished by their relative stability
to denaturing detergents and by their ATP requirements from
the recently described tyrosine kinase activity of the c-abl
gene product (15).

The publication costs of this article were defrayed in part by page charge 
payment. This article must therefore be hereby marked "advertisement"
in accordance with 18 U.S.C. §1734 solely to indicate this fact.



In concert with structural modification of the amino-terminal
portion of the abl gene, increased level of expression
has been implicated in activation of c-abl oncogenic potential.
Myeloid and erythroid cell lines and clinical samples
derived from acute-phase CML patients contain about 10-fold
higher levels of the 8-kb bcr-abl mRNA and P210c-abl than
the c-abl mRNA forms (6 and 7 kb) and P145c-abl gene product
(5, 8, 9, 11). The higher level of expression of the chimeric
bcr-abl message in acute-phase cells is not likely to be solely
due to the presence of the bcr promoter sequences at the 5'
end of the gene, since the normal 4.5-kb and 6.7-kb bcr-encoded
mRNA species are expressed at an even lower level
than the normal c-abl messages (5, 6).
We have analyzed a series of Epstein-Barr virus-immortalized
B-lymphoid cell lines derived from CML patients (16).
With such in vitro clonal cell lines, we can evaluate whether
the presence of Ph1 always results in synthesis of the chimeric
bcr-abl message and protein, and whether the quantitative
expression varies for cells of B-lymphoid lineage as compared
to previously examined myeloid cell lines. Our results
show that cell lines that retain Ph1 do express bcr-abl message
and protein, but that the level is generally lower and more
variable than previously seen for myeloid cell lines. The
demonstration that the Ph1 chromosomal template can vary
in its level of expression of P210c-abl suggests that secondary
mechanisms, beyond the translocation itself, contribute to
the regulation of the bcr-abl gene in different cell types or
subclones that derive from the affected stem cell.

MATERIALS AND METHODS 

Cells and Cell Labeling. Epstein-Barr virus-transformed
B-lymphoid cell lines were established from peripheral blood
samples of chronic- and acute-phase CML patients as reported
(16). The cell lines are designated according to patient
number, karyotype, and lineage. For example,
SK-CML7Bt(9,22)-33 refers to CML patient 7, B-lymphoid cell
line, 9;22 translocation (Ph1), cell line 33; and SK-CML7BN-2
refers to B-cell line 2 with a normal karyotype derived from
the same patient. Repeat karyotype analysis was performed
to verify the retention of Ph1 just prior to analysis for abl
protein and RNA. Cells were maintained in RPMI 1640
medium with 20% fetal bovine serum. We have not observed
any consistent pattern of in vitro growth rate that correlates
to the stage of disease at the time of transformation with
Epstein-Barr virus. Cells (1.5 x 10^7) were washed twice with
Dulbecco's modified Eagle's medium lacking phosphate and



Abbreviations: bcr, breakpoint-cluster region; CML, chronic
myelogenous leukemia; kb, kilobase(s).

supplemented with 5% dialyzed fetal bovine serum. Cells
were then resuspended in 2 ml of the minimal medium.
Labeling was started with the addition of [32P]orthophosphate
(1 mCi/ml; ICN; 1 Ci = 37 GBq) and continued at 37°C
for

Immunoprecipitation and Immunoblotting. Immunoprecipitations
were carried out as described (10). Cells (1.5 x 10^7)
were washed with phosphate-buffered saline and extracted
with 3-5 ml of phosphate lysis buffer (1% Triton X-100/0.1%
NaDodSO4/0.5% deoxycholate/10 mM Na2HPO4, pH 7.5/
100 mM NaCl) with 5 mM EDTA and 5 mM phenylmethylsulfonyl
fluoride. Extracts were clarified by centrifugation
and precipitated with normal or rabbit anti-abl sera (anti-pEX-2
or anti-pEX-5) (17). The precipitated proteins were
electrophoresed in a NaDodSO4/8% polyacrylamide gel.
32P-labeled proteins were detected by autoradiography.
Alternatively, abl proteins were detected by immunoblotting.
Extracts from unlabeled cells were clarified, and proteins
were concentrated by immunoprecipitation with rabbit antisera
against abl-encoded proteins [anti-pEX-2 and anti-pEX-5
combined (11)] and then fractionated in 8% acrylamide gels.
Proteins were transferred from the gel to nitrocellulose
filters, using protease-facilitated transfer (18). The abl-encoded
proteins were detected using murine monoclonal
antibodies as a probe and peroxidase-conjugated goat anti-mouse
second stage antibody (Bio-Rad) for development.
Rabbit antisera and mouse monoclonal antibodies to abl
proteins were prepared using bacterially expressed regions of
the v-abl protein as immunogens (17, 19). Anti-pEX-2 antibodies
react with the internal tyrosine kinase domain and
anti-pEX-5 antibodies react with the carboxyl-terminal segment
of the abl proteins.

RNA Analysis. RNA was extracted from 10^8 cells by the
NaDodSO4/urea/phenol method (20). Polyadenylylated
RNA was purified by oligo(dT) affinity chromatography.
Samples were electrophoresed in a 1% agarose/formaldehyde
gel and transferred to nitrocellulose. abl RNA species
were detected by hybridization with a nick-translated v-abl

^rS^DNA was prepared from 5 x 10> cells of 
each ceU line and processed for Southern blots with a v-aW 
probe as described (21). 

RESULTS 

Variable Levels of P210c-abl Are Detected in Ph1-Positive Cell
Lines. Ph1-positive and Ph1-negative, Epstein-Barr
virus-transformed B-lymphocyte cell lines derived from the same
patient were examined for P210c-abl synthesis by
immunoprecipitation of [32P]orthophosphate-labeled cell extracts
with anti-abl sera (Fig. 1). The normal c-abl protein P145c-abl
was detected at a similar level in multiple Ph1-positive and
Ph1-negative cell lines. P210c-abl was only detected in the
Ph1-positive cell lines because the bcr-abl chimeric gene
which encodes P210c-abl resides on the Ph1 (4, 5, 11, 13). The
level of P210c-abl was about 4- to 5-fold higher than the level
of P145c-abl in the SK-CML7Bt-33 cell line (Fig. 1A, +). The
Ph1-positive erythroid-progenitor cell line K562 (C) showed
a level of P210c-abl about 10-fold higher than P145c-abl.
However, the level of P210c-abl was about one-fifth that of
P145c-abl in the Ph1-positive SK-CML16Bt-1 cell line (Fig. 1B,
+). Comparison of different autoradiographic exposures
roughly indicated that the level of P210c-abl varies over a
20-fold range between these Ph1-positive B-cell lines.
Analysis of four additional Ph1-positive B-cell lines demonstrated
that the level of P210c-abl fell into two general classes; some
cell lines had a level of P210c-abl similar to SK-CML7Bt-33
and others had the low level similar to SK-CML16Bt-1 (Table
1). This differs from previous studies with Ph1-positive
myeloid cell lines and patient samples derived from acute- 
Fig. 1. Detection of variable levels of P210c-abl in Ph1-positive
B-cell lines. Production of P145c-abl and P210c-abl in Epstein-Barr
virus-transformed B-cell lines derived from a blast-crisis (A) and a
chronic-phase (B) CML patient was examined by metabolic labeling
with [32P]orthophosphate and immunoprecipitation. Ph1-negative
(-) and Ph1-positive (+) cell lines derived from each patient were
analyzed. The Ph1-negative cell line in A,- is SK-CML7BN-2 and in
B,- is SK-CML16BN-1. The Ph1-positive cell line in A,+ is
SK-CML7Bt-33 and in B,+ is SK-CML16Bt-1. The K562 cell line, a
Ph1-positive erythroid progenitor cell line spontaneously derived
from a blast-crisis patient (33), is represented in C. Cells
(1.5 × 10^[illegible]) were metabolically labeled with 2 mCi of
[32P]orthophosphate for 3-4 hr and then were extracted and
clarified by centrifugation. Samples were immunoprecipitated with
control normal serum (lanes 1), anti-pEX-2 (lanes 2), or
anti-pEX-5 (lanes 3) and analyzed by NaDodSO4/8% PAGE followed by
autoradiography with an intensifying screen (3 days for A and C,
10 days for B).

phase CML patients, in which P210c-abl was detected at a
10-fold higher level than P145c-abl (refs. 10 and 11; Table 1).
There was no large difference in level of chimeric mRNA and
P210c-abl expressed in four myeloid/erythroid-lineage
Ph1-positive cell lines (K562, EM2, EM3, CML22, and BV173;
refs. 9 and 11), despite a 4- to 5-fold amplification of
abl-related sequences in the K562 cell line.

Detection of different levels of P210c-abl in Fig. 1 could be
due to decreased phosphorylation of P210c-abl, a lower level
of P210c-abl synthesis, or altered stability of the protein. To
help distinguish among these possibilities, the steady-state
level of P210c-abl in the cell lines was assayed by
immunoblotting. The results show that SK-CML7Bt-33 (Fig. 2A, +)
had a higher level of P210c-abl than P145, similar to the results
with metabolic labeling (Fig. 1). We did not detect P210c-abl
by immunoblotting with 2 × 10^7 cells of line SK-CML8Bt-3
(Fig. 2B, +). Reconstruction experiments using dilutions of
cell extracts showed that we could detect about 5-10% the
level of P210c-abl expressed in the K562 cell line (data not
shown). We infer that the steady-state level of P210c-abl in
SK-CML8Bt-3 is lower than the level in SK-CML7Bt-33 by
a factor of at least 10. The level of P210c-abl detected in these
assays correlated with the amount of P210c-abl tyrosine kinase
activity that could be detected in vitro (data not shown).

Different Levels of P210c-abl Are Reflected in the Amount of
Stable bcr-abl mRNA. To identify the basis for detection of
variable levels of P210c-abl, we examined the production of
the abl RNA. RNA blot hybridization analysis using a v-abl
probe (Fig. 3) showed that the normal 6- and 7-kb c-abl
mRNAs were present at a similar level in Ph1-positive and
-negative cell lines derived from different patients. However,
the 8-kb mRNA that encodes P210c-abl was detected at a
10-fold higher level in SK-CML7Bt-33 (Fig. 3A, +) than in
SK-CML16Bt-1 (B, +), which correlated with the relative
level of P210c-abl detected in each cell line. Analysis of
additional cell lines demonstrated that the level of 8-kb RNA
directly correlated with the level of P210c-abl (Table 1). The
variation in level of 8-kb RNA detected in these cell lines was
not due to loss or gain of Ph1, because cytogenetic analysis
confirmed the presence of Ph1 in these cell lines (ref. 16 and
Table 1. Relative levels of bcr-abl expression in Epstein-Barr
virus-immortalized B-cell lines and myeloid CML lines

*Cell lines derived from CML patients by transformation with
Epstein-Barr virus as described (16). Names of cell lines indicate
patient number and Ph1 status: SK-CML7Bt indicates a cell line
derived from patient 7 that carries the 9;22 Ph1 translocation; N
indicates a normal karyotype. Myeloid-erythroid cell lines (K562,
EM2, and BV173) are described in previous publications (9, 11,
[illegible]

†Status of patient at the time cell line was derived. BC, blast
crisis; Acc, accelerated phase.

‡Presence (+) or absence (-) of Ph1 as demonstrated by karyotypic
or Southern blot analysis.

§P210c-abl detected as described in legend to Fig. 1. B-cell lines
derived from blast-crisis and accelerated-phase patients had levels
of P210 3- to 5-fold higher (+++) than levels of P145. Chronic-
phase-derived cell lines had P210 levels lower than or just equivalent
(+) to the level of P145. Myeloid and erythroid lines had levels of
P210 5- to 10-fold higher than P145 (+++++).

¶Eight-kilobase bcr-abl mRNA detected as described in legend to
Fig. 3. Symbols: ±, borderline detectable; +++++, level of 8-kb
mRNA 5- to 10-fold higher than that of the 6- and 7-kb c-abl mRNA
species; +++, level of 8-kb mRNA 3- to 5-fold higher than that of
the 6- and 7-kb species; +, a level approximately equivalent to that
of the 6- and 7-kb messages.

data not shown). There was no difference in the copy number
of abl-related sequences as judged by Southern blot analysis
(Fig. 4). Only the K562 cell line control showed an
amplification of abl sequences, as previously reported (22, 23).
These combined data suggest that differential bcr-abl mRNA
expression from a single gene template is responsible for the
variable levels of P210c-abl detected. This could be mediated
Fig. 2. Analysis of steady-state abl protein levels by
immunoblotting. Cell extracts prepared from 2 × 10^7 cells of lines
SK-CML7BN-2 (A,-), SK-CML7Bt-33 (A,+), SK-CML8BN-10 (B,-),
and SK-CML8Bt-3 (B,+) were concentrated by immunoprecipitation
with anti-pEX-2 plus anti-pEX-5. Samples were then
electrophoresed in a NaDodSO4/8% polyacrylamide gel and transferred to
nitrocellulose, using protease-facilitated transfer (18). abl proteins
were detected using a mixture of two monoclonal antibodies directed
against the pEX-2 and pEX-5 abl-protein fragments produced in
bacteria (19) as a probe and a peroxidase-conjugated goat anti-mouse
second-stage antibody (Bio-Rad) for development.




Fig. 3. Comparison of abl RNA levels in Ph1-positive and
-negative B-cell lines. The levels of the normal 6- and 7-kb c-abl
RNAs and the 8-kb bcr-abl RNA were analyzed by blot hybridization
using a v-abl probe. RNA was extracted from Ph1-negative lines
SK-CML7BN-2 (A,-) and SK-CML16BN-1 (B,-), from Ph1-positive
lines SK-CML7Bt-33 (A,+) and SK-CML16Bt-1 (B,+), and
from line K562 (C,+) by the NaDodSO4/urea/phenol method (20).
Polyadenylylated RNA was purified by oligo(dT) affinity
chromatography, and 15 µg of each sample was electrophoresed in a 1%
agarose/formaldehyde gel and then transferred to nitrocellulose. The
blotted RNAs were hybridized with a nick-translated v-abl fragment
probe (21) and then autoradiographed for 4 days.

by factors influencing the transcription rate of the bcr-abl
gene or the stability of the mRNA.

DISCUSSION 

Several lines of evidence suggest that formation of Ph1 is not
the primary event that affects the stem cell in CML. Patients
have been identified that present with the clinical picture of
CML but only later develop Ph1 (1). This observation,
coupled with studies of G6PD (glucose-6-phosphate
dehydrogenase)-heterozygous females with CML that demonstrate
stem-cell clonality by isozyme analysis among cell
Fig. 4. Southern blot analysis of abl sequences in Ph1-positive
and -negative B-cell lines. High molecular weight DNA (15 µg) was
digested with restriction endonuclease BamHI, separated in a
[illegible]% agarose gel, and then transferred to nitrocellulose.
The blotted DNA fragments were hybridized with a nick-translated,
2.4-kb Bgl II v-abl fragment (1.5 × 10^8 cpm/µg; ref. 21) and
exposed for 4 days. (A) Autoradiogram of abl-specific fragments in
cell lines HL-60 (lane 1), EM2 (lane 2), K562 (lane 3),
SK-CML7Bt-33 (lane 4), SK-CML8Bt-3 (lane 5), SK-CML16Bt-1 (lane 6),
SK-CML21Bt-[illegible] (lane 7), SK-CML35Bt-2 (lane 8),
SK-CML7BN-2 (lane 9), SK-CML8BN-2 (lane 10), and SK-CML35BN-1
(lane 11). (B) Ethidium bromide staining of agarose gel prior to
transfer to nitrocellulose, showing the level of variation in
amount of DNA loaded per lane.
populations that lack the Ph1 marker, supports a secondary
or complementary role for Ph1 in the progression of the
disease (24, 25). This chromosome marker is found in
chronic, accelerated, and blast-crisis phases of the disease. It
is likely that Ph1 confers some growth advantage, since cells
with the marker chromosome eventually predominate the
marrow and peripheral blood even in chronic phase. During
the phase of blast crisis, many patients develop additional
chromosome abnormalities, including duplication of Ph1, a
variety of trisomies, and complex translocations (26). This
is suggestive evidence for Ph1 being a necessary but not
sufficient genetic change for the full evolution of the
disease.

The realization that one molecular result of Ph1 is the
generation of a chimeric bcr-abl protein with functional
characteristics and structure analogous to the gag-abl
transforming protein of the Abelson murine leukemia virus
strengthens the argument for an important role of Ph1 in the
pathogenesis of CML. Although the Abelson virus is generally
considered a rapidly transforming retrovirus, its effects
can range from overcoming growth factor requirements, to
cellular lethality, to induction of highly oncogenic tumors in
a number of hematopoietic cell lineages (27, 28). Even in the
transformation of murine cell targets, there are several lines
of evidence that suggest that the growth-promoting activity of
the v-abl gene product is complemented by further cellular
changes in the production of the malignant-cell phenotype
(29-31).

The regulation of bcr-abl gene expression is complex 
because the 5' end of the gene is derived from the non-abl
sequences, bcr, normally found on chromosome 22 (6). The
levels of stable message for the normal bcr gene and the
normal abl gene are both much lower than the level of the 
bcr-abl message and protein from cell lines and clinical 
specimens derived from myeloid blast-crisis patients (5, 6, 
11). Therefore, the high level of bcr-abl expression cannot 
simply be attributed to the regulatory sequences associated 
with bcr. Possibly, creation of the chimeric gene disrupts the 
normal regulatory sequences and results in a higher level of 
expression. Variation in bcr-abl expression may result from 
secondary changes in the structure of the chimeric gene or 
function of trans-acting factors that occur during evolution of
the disease. Our analysis of P210c-abl and the 8-kb mRNA in
Epstein-Barr virus-transformed Ph1-positive B-cell lines
demonstrates that stable message and protein levels from the 
bcr-abl gene can vary over a wide range. This variation does 
not result from a change in the number of bcr-abl templates 
secondary to gene amplification but more likely from changes 
in either transcription rate or mRNA stability. We suspect 
this range of bcr-abl expression is not limited to lymphoid 
cells. Analysis of peripheral blood leukocytes derived from 
an unusual CML patient who has been in chronic phase with 
myeloid predominance for 16 years showed a level of 
P210c-abl one-fifth that of P145c-abl, as detected by metabolic
labeling with [32P]orthophosphate and immunoprecipitation
(S.C., O.N.W., and P. Greenberg, unpublished observations).
Lower levels of expression of the chimeric mRNA
have been demonstrated in clinical samples from chronic- 
phase CML patients compared to acute-phase CML patients 
(9). Others have reported chronic-phase patients with variable
but, in some cases, relatively high levels of the bcr-abl
mRNA (32). The sampling variation and the heterogeneous
mixture of cell types in clinical samples complicate such 
analyses. Further work is needed to evaluate whether there 
is a defined change in P210c-abl expression during the
progression of CML. It is interesting to note that among the
limited sample of Ph1-positive B-cell lines we have examined
(Table 1), we have seen higher levels of P210c-abl in those
derived from patients at more advanced stages of the disease.
It will be important to search for cell-type-specific
mechanisms that might regulate expression of bcr-abl from Ph1.
Proteome analysis: Biological assay or data archive?

In this review we examine the current state of proteome analysis. There are 
three main issues discussed: why it is necessary to study proteomes; how
proteomes can be analyzed with current technology; and how proteome analysis
can be used to enhance biological research. We conclude that proteome anal- 
ysis is an essential tool in the understanding of regulated biological systems. 
Current technology, while still mostly limited to the more abundant proteins, 
enables the use of proteome analysis both to establish databases of proteins 
present, and to perform biological assays involving measurement of multiple 
variables. We believe that the utility of proteome analysis in future biological 
research will continue to be enhanced by further improvements in analytical 
technology. 
1 Introduction

A proteome has been defined as the protein complement 
expressed by the genome of an organism, or, in multicel- 
lular organisms, as the protein complement expressed by a 
tissue or differentiated cell [1]. In the most common
implementation of proteome analysis the proteins extracted
from the cell or tissue analyzed are separated by
high-resolution two-dimensional gel electrophoresis (2-DE),
detected in the gel and identified by their amino acid 
sequence. The ease, sensitivity and speed with which gel- 
separated proteins can be identified by the use of recently 
developed mass spectrometric techniques have dramati- 
cally increased the interest in proteome technology. One 
of the most attractive features of such analyses is that com- 
plex biological systems can potentially be studied in their 
entirety, rather than as a multitude of individual compo- 
nents. This makes it far easier to uncover the many com- 
plex, and often obscure, relationships between mature 
gene products in cells. Large-scale proteome characteriza- 
tion projects have been undertaken for a number of dif- 
ferent organisms and cell types. Microbial proteome pro- 
jects currently in progress include, for example: Saccharo- 
myces cerevisiae [2], Salmonella enterica [3], Spiroplasma 
melliferum [4], Mycobacterium tuberculosis [5], Ochrobac- 
trum anthropi [6], Haemophilus influenzae [7], Synecho- 
cystis spp. [8], Escherichia coli [9], Rhizobium legumino- 
sarum [10], and Dictyostelium discoideum [11]. Proteome 
projects underway for tissues of more complex organ- 
isms include those for: human bladder squamous cell 
carcinomas [12], human liver [13], human plasma [13], 
human keratinocytes [12], human fibroblasts [12], mouse 
kidney [12], and rat serum [14]. In this manuscript we cri- 
tically assess the concept of proteome analysis and the 
technical feasibility of establishing complete proteome 
maps, and discuss ways in which proteome analysis and 
biological research intersect. 

2 Rationale for proteome analysis 

The dramatic growth in both the number of genome 
projects and the speed with which genome sequences 
are being determined has generated huge amounts of 
sequence information, for some species even complete 
genomic sequences ([15-17]). The description of the 
state of a biological system by the quantitative measure- 
ment of system components has long been a primary 
objective in molecular biology. With recent technical 
advances including the development of differential dis- 
play-PCR [18], cDNA microarray and DNA chip techno- 
logy [19, 20] and serial analysis of gene expression 
(SAGE) [21, 22], it is now feasible to establish global and 
quantitative mRNA expression maps of cells and tissues, 
in which the sequence of all the genes is known, at a 
speed and sensitivity which is not matched by current 
protein analysis technology. Given the long-standing
paradigm in biology that DNA synthesizes RNA which 
synthesizes protein, and the ability to rapidly establish 
global, quantitative mRNA expression maps, the ques- 
tions which arise are why technically complex proteome 
projects should be undertaken and what specific types of 
information could be expected from proteome projects 
which cannot be obtained from genomic and transcript 
profiling projects. We see three main reasons for pro- 
teome analysis to become an essential component in the 
comprehensive analysis of biological systems, (i) Protein 
expression levels are not predictable from the mRNA 
expression levels, (ii) proteins are dynamically modified 
and processed in ways which are not necessarily 
apparent from the gene sequence, and (iii) proteomes 
are dynamic and reflect the state of a biological system. 

2.1 Correlation between mRNA and protein expression 
levels 

Interpretations of quantitative mRNA expression profiles 
frequently implicitly or explicitly assume that for specific 
genes the transcript levels are indicative of the levels of 
protein expression. As part of an ongoing study in our 
laboratory, we have determined the correlation of expres- 
sion at the mRNA and protein levels for a population of 
selected genes in the yeast Saccharomyces cerevisiae 
growing at mid-log phase (S. P. Gygi et al., submitted for 
publication). mRNA expression levels were calculated 
from published SAGE frequency tables [22]. Protein 
expression levels were quantified by metabolic radiola- 
beling of the yeast proteins, liquid scintillation counting 
of the protein spots separated by high resolution 2-DE 
and mass spectrometric identification of the protein(s) 
migrating to each spot. The selected 80 samples consti- 
tute a relatively homogeneous group with respect to pre- 
dicted half-life and expression level of the protein pro- 
ducts. Thus far, we have found a general trend but no 
strong correlation between protein and transcript levels 
(Fig. 1). For some genes studied equivalent mRNA trans- 
cript levels translated into protein abundances which 
varied by more than 50-fold. Similarly, equivalent steady- 
state protein expression levels were maintained by trans- 
cript levels varying by as much as 40-fold (S. P. Gygi 
et al., submitted). These results suggest that even for a
population of genes predicted to be relatively homoge- 
neous with respect to protein half-life and gene expres- 
sion, the protein levels cannot be accurately predicted 
from the level of the corresponding mRNA transcript. 
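
The comparison described in this section can be illustrated with a short calculation. The following Python sketch computes a Pearson correlation between log-transformed mRNA and protein abundances, and the fold range of protein abundance among genes with the same transcript level. All numerical values are hypothetical placeholders, not data from the study, and the function names are introduced here for illustration only.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fold_range(values):
    """Return max/min, the fold variation within a set of positive values."""
    if not values or min(values) <= 0:
        raise ValueError("need a non-empty set of positive values")
    return max(values) / min(values)


# Hypothetical abundances (arbitrary units) for six genes; NOT data from the study.
mrna = [1.0, 1.0, 2.0, 5.0, 10.0, 40.0]
protein = [2.0, 100.0, 10.0, 8.0, 300.0, 100.0]

# Abundances span orders of magnitude, so correlate on a log scale.
r = pearson([math.log10(m) for m in mrna], [math.log10(p) for p in protein])

# Protein levels of the genes that share the same (lowest) transcript level.
same_transcript_level = [p for m, p in zip(mrna, protein) if m == 1.0]

print(f"log-log Pearson r = {r:.2f}")
print(f"protein fold range at equal mRNA level = {fold_range(same_transcript_level):.0f}")
```

A moderate overall r can coexist with a large fold range at a fixed transcript level, which is the situation the text describes as a general trend without a strong correlation.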
Figure 1. Correlation between mRNA and protein levels in yeast cells

2.2 Proteins are dynamically modified and processed

In the mature, biologically active form many proteins are 
post-translationally modified by glycosylation,
phosphorylation, prenylation, acylation, ubiquitination or one or
more of many other modifications [23] and many proteins
are only functional if specifically associated or complexed
with other molecules, including DNA, RNA, proteins
and organic and inorganic cofactors. Frequently,
modifications are dynamic and reversible and may alter 
the precise three-dimensional structure and the state of 
activity of a protein. Collectively, the state of modification
of the proteins which constitute a biological system
is an important indicator of the state of the system. The
type of protein modification and the sites modified at a
specific cellular state can usually not be determined
from the gene sequence alone.

2.3 Proteomes are dynamic and reflect the state of a
biological system

A single genome can give rise to many qualitatively and 
quantitatively different proteomes. Specific stages of the 
cell cycle and states of differentiation, responses to 
growth and nutrient conditions, temperature and stress 
and pathological conditions represent cellular states 
which are characterized by significantly different pro- 
teomes. The proteome, in principle, also reflects events 
that are under translational and post-translational con- 
trol. It is therefore expected that proteomics will be able 
to provide the most precise and detailed molecular des- 
cription of the state of a cell or tissue, provided that the 
external conditions defining the state are carefully deter- 
mined. In answer to the question of whether the study 
of proteomes is necessary for the analysis of biomolec- 
ular systems, it is evident that the analysis of mature pro- 
tein products in cells is essential as there are numerous 
levels of control of protein synthesis, degradation,
processing and modification, which are only apparent by
direct protein analysis. 



3 Description and assessment of current proteome 
analysis technology 

3.1 Technical requirements of proteome technology 

In biological systems the level of expression as well as 
the states of modification, processing and macro-molec- 
ular association of proteins are controlled and modu- 
lated depending on the state of the system. Comprehen- 
sive analysis of the identity, quantity and state of modifi- 
cation of proteins therefore requires the detection and 
quantitation of the proteins which constitute the system,
and analysis of differentially processed forms. There are 
a number of inherent difficulties in protein analysis 
which complicate these tasks. First, proteins cannot be 
amplified. It is possible to produce large amounts of a 
particular protein by over-expression in specific cell sys- 
tems. However, since many proteins are dynamically 
post-translationally modified, they cannot be easily am- 
plified in the form in which they finally function in the 
biological system. It is frequently difficult to purify from 
the native source sufficient amounts of a protein for 
analysis. From a technological point of view this
translates into the need for high-sensitivity analytical
techniques. Second, many proteins are modified and
processed post-translationally. Therefore, in addition to the
protein identity, the structural basis for differentially 
modified isoforms also needs to be determined. The dis- 
tribution of a constant amount of protein over several 
differentially modified isoforms further reduces the 
amount of each species available for analysis. The
complexity and dynamics of post-translational protein
editing thus significantly complicate proteome studies.
Third, proteins vary dramatically with respect to their 
solubility in commonly used solvents. There are few, if 
any, solvent conditions in which all proteins are soluble 
and which are also compatible with protein analysis. This 
makes the development of protein purification methods 
particularly difficult since both protein purification and 
solubility have to be achieved under the same condi- 
tions. Detergents, in particular sodium dodecyl sulfate 
(SDS), are frequently added to aqueous solvents to 
maintain protein solubility. The compatibility with SDS 
is a big advantage of SDS polyacrylamide gel
electrophoresis (SDS-PAGE) over other protein separation
techniques. Thus, SDS-PAGE and two-dimensional gel
electrophoresis, which also uses SDS and other
detergents, are the most general and preferred methods for
the purification of small amounts of proteins, provided 
that activity does not necessarily need to be maintained. 
Lastly, the number of proteins in a given cell system is 
typically in the thousands. Any attempt to identify and 
categorize all of these must use methods which are as 
rapid as possible to allow completion of the project 
within a reasonable time frame. Therefore, a successful, 
general proteomics technology requires high sensitivity, 
high throughput, the ability to differentiate differentially 
modified proteins, and the ability to quantitatively dis- 
play and analyze all the proteins present in a sample. 

3.2 2-D electrophoresis — mass spectrometry: a common 
implementation of proteome analysis 

The most common currently used implementation of 
proteome analysis technology is based on the separation 
of proteins by two-dimensional (IEF/SDS-PAGE) gel 
electrophoresis and their subsequent identification and 
analysis by mass spectrometry (MS) or tandem mass 
spectrometry (MS/MS). In 2-DE, proteins are first separ- 
ated by isoelectric focusing (IEF) and then by SDS- 
PAGE, in the second, perpendicular dimension. Separ- 
ated proteins are visualized at high sensitivity by staining 
or autoradiography, producing two-dimensional arrays of 
proteins. 2-DE gels are, at present, the most commonly 
used means of global display of proteins in complex 
samples. The separation of thousands of proteins has
been achieved in a single gel [24, 25] and differentially 
modified proteins are frequently separated. Due to the 
compatibility of 2-DE with high concentrations of deter- 
gents, protein denaturants and other additives promoting 
protein solubility, the technique is widely used. 

The second step of this type of proteome analysis is the 
identification and analysis of separated proteins. Individ- 
ual proteins from polyacrylamide gels have traditionally 
been identified using N-terminal sequencing [26, 27],
internal peptide sequencing [28, 29], immunoblotting or 
comigration with known proteins [30]. The recent dra- 
matic growth of large-scale genomic and expressed 
sequence tag (EST) sequence databases has resulted in a 
fundamental change in the way proteins are identified by 
their amino acid sequence. Rather than by the traditional 
methods described above, protein sequences are now fre- 
quently determined by correlating mass spectral or 
tandem mass spectral data of peptides derived from pro- 
teins, with the information contained in sequence data- 
bases [31-33]. 

There are a number of alternative approaches to pro- 
teome analysis currently under development. There is 
considerable interest in developing a proteome analysis 
strategy which bypasses 2-DE altogether, because it is
considered a relatively slow and tedious process, and 
because of perceived difficulties in extracting proteins 
from the gel matrix for analysis. However, 2-DE as a 
starting point for proteome analysis has many advan- 
tages compared to other techniques available today. The 
most significant strengths of the 2-DE-MS approach
include the relatively uniform behavior of proteins in 
gels, the ability to quantify spots and the high resolution 
and simultaneous display of hundreds to thousands of 
proteins within a reasonable time frame. 

A schematic diagram of a typical procedure for the
identification of gel-separated proteins is shown in Fig. 2. Pro-
tein spots detected in the gel are enzymatically or chemi- 
cally fragmented and the peptide fragments are isolated 
for analysis, as already indicated, most frequently by MS 
or MS/MS. There are numerous protocols for the gener- 
ation of peptide fragments from gel-separated proteins. 
They can be grouped into two categories, digestion in 
the gel slice [28, 34] or digestion after electrotransfer out 
of the gel onto a suitable membrane ([29, 35—37] and 
reviewed in [38]). In most instances either technique is 
applicable and yields good results. The analysis of MS or 
MS/MS data is an important step in the whole process 
because MS instruments can generate an enormous 
amount of information which cannot easily be managed 
manually. Recently, a number of groups have developed 
software systems dedicated to the use of peptide MS 
and MS/MS spectra for the identification of proteins. 
Proteins are identified by correlating the information 
contained in the MS spectra of protein digests or 
MS/MS spectra of individual peptides with data con- 
tained in DNA or protein sequence databases. 
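
The idea of correlating the peptide masses of a protein digest with a sequence database can be sketched as follows. This is a minimal illustration, not the software used by the authors: it assumes trypsin as the enzyme (cleavage after K or R, but not before P), monoisotopic residue masses, a fixed mass tolerance, and a toy in-memory database. The protein names and sequences are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Split a protein at the C-terminal side of K or R, except before P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[residue] for residue in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance=0.5):
    """Count observed masses that lie within tolerance of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        1 for m in observed_masses
        if any(abs(m - p) <= tolerance for p in predicted)
    )


def best_match(observed_masses, database, tolerance=0.5):
    """Return the database entry name with the most matching peptide masses."""
    return max(
        database,
        key=lambda name: count_matches(observed_masses, database[name], tolerance),
    )


# Hypothetical two-entry database and a simulated digest of the first entry.
database = {"protA": "AKGGR", "protB": "WWWK"}
observed = [peptide_mass("AK"), peptide_mass("GGR")]
print(best_match(observed, database))
```

Counting matches is the simplest possible score; practical search programs weight matches statistically and allow for missed cleavages and modifications, which this sketch ignores.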

The systems we are currently using in our laboratory are 
based on the separation of the peptides contained in pro- 
tein digests by narrow bore or capillary liquid chromatog- 
Figure 2. Schematic diagram of a procedure for identification of
gel-separated proteins. Peptides can either be separated by a technique
such as LC or CE, or infused as a mixture and sorted in the MS. Data- 
base searching can either be performed on peptide masses from an 
MS spectrum, peptide fragment masses from CID spectra of peptides, 
or a combination of both. 



raphy [39, 40] or capillary electrophoresis [41], the anal- 
ysis of the separated peptides by electrospray ioniza- 
tion (ESI) MS/MS, and the correlation of the generated 
peptide spectra with sequence databases using the 
SEQUEST program developed at the University of Wash- 
ington [32, 33]. The system automatically performs the 
following operations: a particular peptide ion character- 
ized by its mass-to-charge ratio is selected in the MS out 
of all the peptide ions present in the system at a
particular time; the selected peptide ion is collided in a
collision cell with argon (collision-induced dissociation,
CID) and the masses of the resulting fragment ions are 
determined in the second sector of the tandem MS; this 
experimentally determined CID spectrum is then corre- 
lated with the CID spectra predicted from all the pep- 
tides in a sequence database which have essentially the 
same mass as the peptide selected for CID; this correla- 
tion matches the isolated peptide with a sequence seg- 
ment in a database and thus identifies the protein from 
which the peptide was derived. There are a number of 
alternative programs which use peptide CID spectra for 
protein identification, but we use the SEQUEST system 
because it is currently the most highly automated pro- 
gram and has proven to be successful, versatile and 
robust. 
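
The correlation step described above can be illustrated with a small sketch. It is not the SEQUEST algorithm: it assumes a simple shared-peak count as the score, singly charged b and y fragment ions only, and a toy database. The function names, tolerances and protein labels are hypothetical and chosen only for illustration; the residue masses are standard monoisotopic values.

```python
# Illustrative sketch only: a shared-peak count stands in for the
# spectrum correlation; this is NOT how SEQUEST scores spectra.

# Standard monoisotopic residue masses (Da) for a subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[a] for a in seq) + WATER


def fragment_ions(seq):
    """Predicted m/z values of singly charged b and y ions."""
    ions = []
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[a] for a in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[a] for a in seq[i:]) + WATER + PROTON
        ions.extend([b, y])
    return ions


def shared_peaks(observed, predicted, tol=0.5):
    """Number of predicted ions found in the observed spectrum (assumed tolerance)."""
    return sum(any(abs(o - p) <= tol for o in observed) for p in predicted)


def identify(precursor_mass, observed, database, mass_tol=1.0):
    """Score only database peptides with essentially the same mass as the precursor."""
    scored = [
        (shared_peaks(observed, fragment_ions(pep)), name, pep)
        for name, pep in database
        if abs(peptide_mass(pep) - precursor_mass) <= mass_tol
    ]
    return max(scored) if scored else None


# Hypothetical database: two peptides of identical mass and one of different mass.
DATABASE = [("protA", "GASPVK"), ("protB", "KVPSAG"), ("protC", "LLNDER")]
```

Restricting the comparison to peptides of essentially the same mass, as the text describes, keeps the number of predicted spectra small; the fragment ions then distinguish candidates that the precursor mass alone cannot.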



3.3 Protein identification by LC-MS/MS, capillary 
LC-MS/MS and CE-MS/MS 

It has been demonstrated repeatedly that MS has a very 
high intrinsic sensitivity. For the routine analysis of gel- 
separated proteins at high sensitivity, the most signif- 
icant challenge is the handling of small amounts of 
sample. The crux of the problem is the extraction and 
transferal of peptide mixtures generated by the digestion 
of low nanogram amounts of protein, from gels into the 
MS/MS system without significant loss of sample or 
introduction of unwanted contaminants. We employ
three different systems for introducing gel-purified samples
into an MS, depending on the level of sensitivity required.
As an approximate guideline, for samples containing tens of
picomoles of peptides, LC-MS/MS is most appropriate; for
samples containing low picomole amounts to high femtomole
amounts we use capillary LC-MS/MS; and for samples containing
femtomoles or less, CE-MS/MS is the method of choice.
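
The guideline above can be written as a simple decision rule. The text gives only approximate ranges, so the numeric cutoffs below (10 pmol and 100 fmol) are assumptions made for this sketch, and the function name is hypothetical.

```python
def choose_method(femtomoles):
    """Pick a sample introduction method from the approximate peptide amount.

    Cutoffs are assumed values, not taken from the text:
    "tens of picomoles" is read as >= 10 pmol (10,000 fmol) and
    "low picomole to high femtomole" as >= 100 fmol.
    """
    if femtomoles < 0:
        raise ValueError("amount must be non-negative")
    if femtomoles >= 10_000:
        return "LC-MS/MS"
    if femtomoles >= 100:
        return "capillary LC-MS/MS"
    return "CE-MS/MS"
```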

3.3.1 LC-MS/MS

The coupling of an MS to an HPLC system using a 
0.5 mm diameter or bigger reverse phase (RP) column 
has been described in detail [42]. This system has several 
advantages if a large number of samples are to be ana- 
lyzed and all are available in sufficient quantity. The 
LC-MS and database searching program can be run in a 
fully automated mode using an autosampler, thus maxi- 
mizing sample throughput and minimizing the need for 
operator interference. The relatively large column is 
tolerant of high levels of impurities from either gel prep- 
aration or sample matrix. Lastly, if configured with a 
flow-splitter and micro-sprayer [40], analyses can be per- 
formed on a small fraction of the sample (less than 5%) 
while the remainder of the sample is recovered in very 
pure solvents. This latter feature is particularly useful 
when an orthogonal technique is also used to analyze 
peptide fractions, such as scintillation of an introduced 
radiolabel, and this data can be correlated with peptides 
identified by CID spectra. 

3.3.2 Capillary LC-MS 

An increase of sensitivity of approximately tenfold can be
achieved by using a capillary LC system with a 100 µm ID
column rather than a 0.5 mm ID column as referred to
above. Since very low flow rates are required for such
columns, most reports have used a precolumn flow splitting
system for producing solvent gradients. We have recently
described the design and construction of a novel gradient
mixing system which enables the formation of reproducible
gradients at very low flow rates (low nL/min) without the
need for flow splitting (A. Ducret et al., submitted for
publication). Using this capillary LC-MS/MS system we were
able to identify gel-separated proteins if low picomole to
high femtomole amounts were loaded onto the gel [40]. This
system is as yet not automated and, like all capillary LC
systems, is prone to blockage of the columns by
microparticulates when analyzing gel-separated proteins.

3.3.3 CE-MS/MS 

The highest level of sensitivity for analyzing gel-separated
proteins can be achieved by using capillary electrophoresis -
mass spectrometry (CE-MS). We have described in the past a
solid-phase extraction capillary electrophoresis (SPE-CE)
system which was used with triple quadrupole and ion trap
ESI-MS/MS systems for the identification of proteins at the
low femtomole to subfemtomole sensitivity level [43, 44].
While this system is highly sensitive, its operation is
labor-intensive and has not been automated. In order to
devise an analytical system with both the sensitivity of CE
and the level of automation of LC, we have constructed
microfabricated devices for the introduction of samples
into ESI-MS for high-sensitivity peptide analysis.

Electrophoresis 1998, 19, 1862-1871

Figure 3. Schematic illustration of a microfabricated
analytical system for CE, consisting of a micromachined
device, coated capillary electroosmotic pump, and
microelectrospray interface. The dimensions of the channels
and reservoir are as indicated in the text. The channels on
the device were graphically enhanced to make them more
visible. Reproduced from [45], with permission. Labels in
the figure: MS entrance; Reservoir 2; Reservoir 3; 12 cm
capillary ("pump").

The basic device is a piece of glass into which channels 
of 10-30 µm in depth and 50-70 µm in diameter are
etched by using photolithography/etching techniques 
similar to the ones used in the semiconductor industry. 
(A simple device is shown in Fig. 3). The channels are 
connected to an external high voltage power supply [45]. 
Samples are manipulated on the device and off the 
device to the MS by applying different potentials to the 
reservoirs. This creates a solvent flow by electroosmotic 
pumping which can be redirected by changing the posi- 
tion of the electrode. Therefore, without the need for 
valves or gates and without any external pumping, the 
flow can be redirected by simply switching the position 
of the electrodes on the device. The direction and rate of 
the flow can be modulated by the size and the polarity 
of the electric field applied and also by the charge state 
of the surface. 

The type of data generated by the system is illustrated in
Fig. 4, which shows the mass spectrum of a peptide sample
representing the tryptic digest of carbonic anhydrase at
290 fmol/µL. Each numbered peak indicates a peptide
successfully identified as being derived from carbonic
anhydrase. Some of the unassigned signals may be chemical
or peptide contaminants. The MS is programmed to
automatically select each peak and subject the peptide to CID.
The resulting CID spectra are then used to identify the
protein by correlation with sequence databases. Therefore,
this system allows us to concurrently apply a number of
protein digests onto the device, to sequentially mobilize
the samples, to automatically generate CID spectra of
selected peptide ions and to search sequence databases
for protein identification. These steps are performed
automatically without the need for user input and proteins can
be identified at very low femtomole level sensitivity at a
rate of approximately one protein per 15 min.

3.4 Assessment of 2-DE-MS proteome technology 

Figure 4. MS spectrum of a tryptic digest of carbonic
anhydrase using the microfabricated system shown in Fig. 3.
290 fmol/µL of carbonic anhydrase tryptic digest was infused
into a Finnigan LCQ ion trap MS. Each peak was selected for
CID, and those which were identified as containing peptides
derived from carbonic anhydrase are numbered. Reproduced
from [45], with permission.

Figure 5. 2-DE separation of a lysate of yeast cells, with identified proteins highlighted. The first dimension of separation was an IPG from
pH 3-10, and the second dimension was a 10%T SDS-PAGE gel. Proteins were visualized by silver staining. Further details of experimental
procedures are included in S. P. Gygi et al. (submitted).

Using a combination of the analytical techniques described
above we have identified the 80 protein spots indicated in
Fig. 5. The protein pattern was generated by separating a
total of 40 µg of protein contained in a total cell lysate
of the yeast strain YPH499 by high resolution 2-DE and
silver staining of the separated proteins. To estimate how
far this type of proteome analysis can penetrate towards the
identification of low abundance proteins, we have calculated
the codon bias of the genes encoding the respective
proteins. Codon bias is a calculated measure of the degree
of redundancy of triplet DNA codons used to produce each
amino acid in a particular gene sequence. It has been shown
to be a useful indicator of the level of the protein product
of a particular gene sequence present in a cell [46]. The
general rule is that the higher the codon bias calculated
for a gene, the more abundant the protein product of that
gene. The calculated codon bias values corresponding to the
proteins identified in Fig. 5 are shown in Fig. 6b. Nearly
all of the proteins identified (> 95%) have codon bias
values of > 0.2, indicating they are highly abundant in
cells. In contrast, codon bias values calculated for the
entire yeast genome (Fig. 6a) show that the majority of
proteins present in the proteome have a codon bias of < 0.2
and are thus of low abundance.
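
The comparison above amounts to asking what fraction of a set of genes exceeds the 0.2 codon bias cutoff. The sketch below shows that bookkeeping only; it does not compute codon bias itself (the method is that of [46]), and the example values are invented placeholders, not data from Fig. 6.

```python
def fraction_above(codon_bias_values, threshold=0.2):
    """Fraction of genes whose codon bias exceeds the threshold."""
    if not codon_bias_values:
        raise ValueError("no values supplied")
    above = sum(1 for v in codon_bias_values if v > threshold)
    return above / len(codon_bias_values)


# Placeholder values for illustration only (not measured data).
identified_spots = [0.85, 0.62, 0.41, 0.33, 0.15]
whole_genome = [0.05, 0.08, 0.12, 0.15, 0.19, 0.31, 0.62]

identified_fraction = fraction_above(identified_spots)
genome_fraction = fraction_above(whole_genome)
```

A large gap between the two fractions is what indicates that the identified spots are biased towards abundant proteins.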

This finding is of considerable importance in our assessment
of the current status of proteome analysis technology. It is
clear that even using highly sensitive analytical
techniques, we are only able to visualize and identify the
more abundant proteins. Since many important regulatory
proteins are present only at low abundance, these would not
be amenable to analysis using such techniques. This
situation would be exacerbated in the analysis of proteomes
containing many more proteins than the approximately 6000
gene products present in yeast cells [16]. In the analysis
of, for example, the proteome of any human cells, there are
potentially 50000-100000 gene products [47]. Inherent
limitations on the amount of protein that can be loaded on
2-DE, and the number of components that can be resolved,
indicate that only the most highly abundant fraction of the
many gene products could be successfully analyzed. One
approach that has been employed to circumvent these
limitations is the use of very narrow range immobilized pH
gradient strips for the first-dimension separation of 2-DE
[48]. Since only those proteins which focus within the
narrow range will enter the second dimension of separation,
a much higher sample loading within the desired range is
possible. This, in turn, can lead to the visualization and
identification of less abundant proteins.

Figure 6. Calculated codon bias values for yeast proteins. (A)
Distribution of calculated values for the entire yeast proteome. (B)
Distribution of calculated values for the subset of 80 identified
proteins also shown in Figs. 1 and 5. Further details of experimental
procedures are included in S. P. Gygi et al. (submitted).



4 Utility of proteome analysis for biological 
research 

For the success of proteomics as a mainstream approach
to the analysis of biological systems it is essential to
define how proteome analysis and biological research
projects intersect. Without a clear plan for implementing
proteome-type approaches in biological research projects,
the full impact of the technology cannot be realized. The
literature indicates that proteome analysis is used both as
a database/data archive, and as a biological assay or
biological research tool.

4.1 The proteome as a database 

The use of proteomics as a database or data archive 
essentially entails an attempt to identify all the proteins 
in a cell or species and to annotate each protein with the 
known biological information that is relevant for each 
protein. The level of annotation can, of course, be exten- 
sive. The most common implementation of this idea is 
the separation of proteins by high resolution 2-DE, the 
identification of each detected protein spot and the 
annotation of the protein spots in a 2-DE gel database 
format. This approach is complicated by the fact that it is 
difficult to precisely define a proteome and to decide 
which proteome should be represented in the database. 
In contrast to the genome of a species, which is essentially
static, the proteome is highly dynamic. Processes such as
differentiation, cell activation and disease can all
significantly change the proteome of a species. This is
illustrated in Fig. 7. The figure shows two high-resolution
2-DE maps of proteins isolated from rat serum. Fig. 7A is
from the serum of normal rats, while Fig. 7B is acute-phase
serum from rats after prior treatment with an
inflammation-causing agent [49]. It is obvious that the
protein patterns are significantly different in several
areas, raising the question of exactly which proteome is
being described.

Therefore, a comprehensive proteome database of a spe- 
cies or cell type needs to contain all of the parameters 
which describe the state and the type of the cells from 
which the proteins were extracted as well as the software 
tools to search the database with queries which reflect 
the dynamics of biological systems. A comprehensive 
proteome database should be capable of quantitatively 
describing the fate of each protein if specific systems 
and pathways are activated in the cell. Specifically, the 
quantity, the degree of modification, the subcellular loca- 
tion and the nature of molecules specifically interacting 
with a protein as well as the rate of change of these 
variables should be described. Using these admittedly
stringent criteria, there is currently no complete proteome
database. A number of such databases are, however, in
the process of being constructed. The most advanced
among them, in our opinion, are the yeast protein database
YPD [50] (accessible at http://www.ypd.com) and
the human 2D-PAGE databases of the Danish Centre
for Human Genome Research [12] (accessible at
http://biobase.dk/cgi-bin/celis). While neither can be
considered complete as not all of the potential gene
products are identified, both contain extensive annotation
of supplemental information for many of the spots
which are positively identified in reference samples.
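
The criteria listed above suggest what a single record in such a database would need to hold. The sketch below is one possible layout, not the schema of YPD or of any existing database; every class, field and function name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProteinObservation:
    """One protein measured in one defined cell state (hypothetical layout)."""
    protein_id: str
    cell_state: str                # e.g. "normal" or "acute-phase"
    quantity: float                # arbitrary units
    modifications: List[str] = field(default_factory=list)
    subcellular_location: str = "unknown"
    interactors: List[str] = field(default_factory=list)


def rate_of_change(earlier, later, elapsed_hours):
    """Change in quantity per hour between two observations of the same protein."""
    if earlier.protein_id != later.protein_id:
        raise ValueError("observations refer to different proteins")
    if elapsed_hours <= 0:
        raise ValueError("elapsed time must be positive")
    return (later.quantity - earlier.quantity) / elapsed_hours
```

Storing the cell state with every observation is the point of the design: the same protein can then be queried across states, which is what a dynamic proteome requires.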

4.2 The proteome as a biological assay 

The use of proteome analysis as a biological assay or 
research tool represents an alternative approach to inte- 
grating biology with proteomics. To investigate the state 
of a system, samples are subjected to a specific process
that allows the quantitative or qualitative measurement 
of some of the variables which describe the system. In 
typical biochemical assays one variable (e.g., enzyme 
activity) of a single component (e.g., a particular en- 
zyme) is measured. Using proteomics as an assay, mul- 
tiple variables (e.g., expression level, rate of synthesis, 
phosphorylation state, etc.) are measured concurrently 
on many (ideally all) of the proteins in a sample. The 
use of proteomics as an assay is a less far-reaching prop- 
osition than the construction of a comprehensive pro- 
teome database. It does, however, represent a pragmatic 
approach which can be adapted to investigate specific 
systems and pathways, as long as the interpretation of 
the results takes into account that with current technol- 
ogy not all of the variables which describe the system 
can be observed (see Section 3.4). 

A common implementation of proteome analysis as a
biological assay is when a 2-DE protein pattern generated
from the analysis of an experimental sample is compared to
an array of reference patterns representing different
states of the system under investigation. The state of the
experimental system at the time the sample was generated is
therefore determined by the quantitative comparative
analysis of hundreds to a few thousand proteins. Comparative
analysis of the 2-DE patterns furthermore highlights
quantitative and qualitative differences in the protein
profiles which correlate with the state of the system. For
this type of analysis it is not essential that all the
proteins are identified or even visualized, although the
results become more informative as more proteins are
compared. It is obvious, however, that the possibility to
identify any protein deemed characteristic for a particular
state dramatically enhances this approach by opening up new
avenues for experimentation.




Figure 7. High resolution 2-DE map of proteins isolated from rat serum with or without prior exposure to an
inflammation-causing agent. (A) normal rat serum, (B) acute-phase serum from rats which had previously been exposed to
an inflammation-causing agent. The first dimension of separation is an IPG from pH 4-10, and the second
dimension is a 7.5-17.5%T gradient SDS-PAGE gel. Proteins were visualized by staining with amido black. Further details
of experimental procedures are included in [14, 49].

Proteome analysis as a biological assay has been
successfully used in the field of toxicology, to characterize
disease states or to study differential activation of cells.
The approach is limited, of course, by the fact that only
the visible protein spots are included in the assay, and it
is well known that a substantial but far from complete
fraction of cellular proteins are detected if a total cell
lysate is separated by 2-DE. Proteins may not be
detected in 2-DE gels because they are not abundant
enough to be visualized by the detection method used,
because they do not migrate within the boundaries (size,
pI) resolved by the gel, because they are not soluble
under the conditions used, or for other reasons.

A different way to use proteome analysis as a biological 
assay to define the state of a biological system is to take 
advantage of the wealth of information contained in 
2-DE protein patterns. 2-DE is referred to as two-dimen- 
sional because of the electrophoretic mobility and the 
isoelectric points which define the position of each pro- 
tein in a 2-DE pattern. In addition to the two dimen- 
sions used to generate the protein patterns, a number of 
additional data dimensions are contained in the protein 
patterns. Some of these dimensions such as protein 
expression level, phosphorylation state, subcellular loca- 
tion, association with other proteins, rate of synthesis or 
degradation indicate the activity state of a protein or a 
biological system. Comparative analysis of 2-DE protein 
patterns representing different states is therefore ideally 
suited for the detection, identification and analysis of 
suitable markers. Once again it must be emphasized that 
in this type of experiment only a fraction of the cellular 
proteins is analyzed. Since many regulatory proteins are 
of low abundance, this limitation is a concern, particu- 
larly in cases in which regulatory pathways are being 
investigated. 

5 Concluding remarks 

In this report we have addressed three main issues 
related to proteome analysis. First, we have discussed 
the rationale for studying proteomes. Second, we have 
assessed the technical feasibility of analyzing proteomes 
and described current proteome technology, and third, 
we have analyzed the utility of proteome analysis for bio- 
logical research. It is apparent that proteome analysis is 
an essential tool in the analysis of biological systems. 
The multi-level control of protein synthesis and degrada- 
tion in cells means that only the direct analysis of 
mature protein products can reveal their correct identi- 
ties, their relevant state of modification and/or associa- 
tion and their amounts. Recently developed methods 
have enabled the identification of proteins at ever- 
increasing sensitivity levels and at a high level of auto- 
mation of the analytical processes. A number of tech- 
nical challenges, however, remain. While it is currently 
possible to identify essentially any protein spots that can 
be visualized by common staining methods, it is ap- 
parent that without prior enrichment only a relatively 
small and highly selected population of long-lived, 
highly expressed proteins is observed. There are many 
more proteins in a given cell which are not visualized by 
such methods. Frequently it is the low abundance
proteins that execute key regulatory functions.

We have outlined the two principal ways proteome analysis
is currently being used to intersect with biological
research projects: the proteome as a database or data 
archive and proteome analysis as a biological assay. Both 
approaches have in common that at present they are con- 
ceptually and technically limited. Current proteome data- 
bases typically are limited to one cell type and one state 
of a cell and therefore do not account for the dynamics 
of biological systems. The use of proteome analysis as a 
biological assay can provide a wealth of information, but 
it is limited to the proteins detected and is therefore not 
truly proteome-wide. These limitations in proteomics are 
to a large extent a reflection of the fact that proteins in 
their fully processed form cannot easily be amplified and 
are therefore difficult to isolate in amounts sufficient for 
analysis or experimentation. The fact that to date no
complete proteome has been described further attests to 
these difficulties. With continued rapid progress in pro- 
tein analysis technology, however, we anticipate that the 
goal of complete proteome analysis will eventually 
become attainable.

the test DNA molecule under conditions suitable for expression of the polypeptide, and (iii) recovering the
polypeptide from the cell culture.

In yet another embodiment, the invention concerns agonists and antagonists of a native PRO1131
polypeptide. In a particular embodiment, the agonist or antagonist is an anti-PRO1131 antibody.

In a further embodiment, the invention concerns a method of identifying agonists or antagonists of a
native PRO1131 polypeptide, by contacting the native PRO1131 polypeptide with a candidate molecule and
monitoring a biological activity mediated by said polypeptide.

In a still further embodiment, the invention concerns a composition comprising a PRO1131 polypeptide,
or an agonist or antagonist as hereinabove defined, in combination with a pharmaceutically acceptable carrier.

In another embodiment, the invention provides an expressed sequence tag (EST) designated herein as
DNA43546 comprising the nucleotide sequence of Figure 231 (SEQ ID NO:320).

99. PRO1281

A cDNA clone (DNA59820-1549) has been identified that encodes a novel secreted polypeptide
designated in the present application as "PRO1281".

In one embodiment, the invention provides an isolated nucleic acid molecule comprising DNA encoding
a PRO1281 polypeptide.

In one aspect, the isolated nucleic acid comprises DNA having at least about 80% sequence identity,
preferably at least about 85% sequence identity, more preferably at least about 90% sequence identity, most
preferably at least about 95% sequence identity to (a) a DNA molecule encoding a PRO1281 polypeptide having
the sequence of amino acid residues from about 16 to about 775, inclusive of Figure 233 (SEQ ID NO:326), or
(b) the complement of the DNA molecule of (a).
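
The tiered identity thresholds recited above (80%, 85%, 90%, 95%) can be illustrated with a short sketch. It assumes two sequences that have already been aligned, with "-" marking gaps, and it counts identical positions over all aligned columns; this is only one of several possible definitions and is not necessarily the method defined elsewhere in the specification. The word "about" in the recited thresholds is not modelled, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns with identical residues (assumed definition)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct):
    """Highest recited threshold met by pct, or None if below 80%."""
    for cutoff in (95, 90, 85, 80):
        if pct >= cutoff:
            return cutoff
    return None
```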

In another aspect, the invention concerns an isolated nucleic acid molecule encoding a PRO1281
polypeptide comprising DNA hybridizing to the complement of the nucleic acid between about residues 273 and
about 2552, inclusive, of Figure 232 (SEQ ID NO:325). Preferably, hybridization occurs under stringent
hybridization and wash conditions.

In a further aspect, the invention concerns an isolated nucleic acid molecule comprising DNA having
at least about 80% sequence identity, preferably at least about 85% sequence identity, more preferably at least
about 90% sequence identity, most preferably at least about 95% sequence identity to (a) a DNA molecule
encoding the same mature polypeptide encoded by the human protein cDNA in ATCC Deposit No. 203129
(DNA59820-1549), or (b) the complement of the DNA molecule of (a). In a preferred embodiment, the nucleic
acid comprises a DNA encoding the same mature polypeptide encoded by the human protein cDNA in ATCC
Deposit No. 203129 (DNA59820-1549).

In a still further aspect, the invention concerns an isolated nucleic acid molecule comprising (a) DNA
encoding a polypeptide having at least about 80% sequence identity, preferably at least about 85% sequence
identity, more preferably at least about 90% sequence identity, most preferably at least about 95% sequence
identity to the sequence of amino acid residues from about 16 to about 775, inclusive of Figure 233 (SEQ ID
NO:326), or (b) the complement of the DNA of (a).

In a further aspect, the invention concerns an isolated nucleic acid molecule having at least about 50
nucleotides, and preferably at least about 100 nucleotides and produced by hybridizing a test DNA molecule
under stringent conditions with (a) a DNA molecule encoding a PRO1281 polypeptide having the sequence of
amino acid residues from about 16 to about 775, inclusive of Figure 233 (SEQ ID NO:326), or (b) the
complement of the DNA molecule of (a), and, if the DNA molecule has at least about an 80% sequence identity,
preferably at least about an 85% sequence identity, more preferably at least about a 90% sequence identity, most
preferably at least about a 95% sequence identity to (a) or (b), isolating the test DNA molecule.

In a specific aspect, the invention provides an isolated nucleic acid molecule comprising DNA encoding
a PRO1281 polypeptide, with or without the N-terminal signal sequence and/or the initiating methionine, or is
complementary to such encoding nucleic acid molecule. The signal peptide has been tentatively identified as
extending from amino acid position 1 through about amino acid position 15 in the sequence of Figure 233 (SEQ
ID NO:326).

In another aspect, the invention concerns an isolated nucleic acid molecule comprising (a) DNA
encoding a polypeptide scoring at least about 80% positives, preferably at least about 85% positives, more
preferably at least about 90% positives, most preferably at least about 95% positives when compared with the
amino acid sequence of residues 16 to about 775, inclusive of Figure 233 (SEQ ID NO:326), or (b) the
complement of the DNA of (a).

Another embodiment is directed to fragments of a PRO1281 polypeptide coding sequence that may find
use as hybridization probes. Such nucleic acid fragments are from about 20 to about 80 nucleotides in length,
preferably from about 20 to about 60 nucleotides in length, more preferably from about 20 to about 50
nucleotides in length, and most preferably from about 20 to about 40 nucleotides in length.

In another embodiment, the invention provides isolated PRO1281 polypeptide encoded by any of the
isolated nucleic acid sequences hereinabove defined.

In a specific aspect, the invention provides isolated native sequence PRO1281 polypeptide, which in one
embodiment, includes an amino acid sequence comprising residues 16 to 775 of Figure 233 (SEQ ID NO:326).

In another aspect, the invention concerns an isolated PRO1281 polypeptide, comprising an amino acid
sequence having at least about 80% sequence identity, preferably at least about 85% sequence identity, more
preferably at least about 90% sequence identity, most preferably at least about 95% sequence identity to the
sequence of amino acid residues 16 to about 775, inclusive of Figure 233 (SEQ ID NO:326).

In a further aspect, the invention concerns an isolated PRO1281 polypeptide, comprising an amino acid
sequence scoring at least about 80% positives, preferably at least about 85% positives, more preferably at least
about 90% positives, most preferably at least about 95% positives when compared with the amino acid sequence
of residues 16 to 775 of Figure 233 (SEQ ID NO:326).

In yet another aspect, the invention concerns an isolated PRO1281 polypeptide, comprising the sequence
of amino acid residues 16 to about 775, inclusive of Figure 233 (SEQ ID NO:326), or a fragment thereof
sufficient to provide a binding site for an anti-PRO1281 antibody. Preferably, the PRO1281 fragment retains
a qualitative biological activity of a native PRO1281 polypeptide.

In a still further aspect, the invention provides a polypeptide produced by (i) hybridizing a test DNA
molecule under stringent conditions with (a) a DNA molecule encoding a PRO1281 polypeptide having the
sequence of amino acid residues from about 16 to about 775, inclusive of Figure 233 (SEQ ID NO:326), or (b)
the complement of the DNA molecule of (a), and if the test DNA molecule has at least about an 80% sequence
identity, preferably at least about an 85% sequence identity, more preferably at least about a 90% sequence
identity, most preferably at least about a 95% sequence identity to (a) or (b), (ii) culturing a host cell comprising
the test DNA molecule under conditions suitable for expression of the polypeptide, and (iii) recovering the
polypeptide from the cell culture.

100. PRO1064 

A cDNA clone (DNA59827-1426) has been identified that encodes a novel transmembrane polypeptide,
designated in the present application as "PRO1064".

In one embodiment, the invention provides an isolated nucleic acid molecule comprising DNA encoding
a PRO1064 polypeptide.

In one aspect, the isolated nucleic acid comprises DNA having at least about 80% sequence identity,
preferably at least about 85% sequence identity, more preferably at least about 90% sequence identity, most
preferably at least about 95% sequence identity to (a) a DNA molecule encoding a PRO1064 polypeptide having
the sequence of amino acid residues from about 1 or about 25 to about 153, inclusive of Figure 235 (SEQ ID
NO:334), or (b) the complement of the DNA molecule of (a).

In another aspect, the invention concerns an isolated nucleic acid molecule encoding a PRO1064
polypeptide comprising DNA hybridizing to the complement of the nucleic acid between about nucleotides 532
or about 604 and about 990, inclusive, of Figure 234 (SEQ ID NO:333). Preferably, hybridization occurs under
stringent hybridization and wash conditions.

In a further aspect, the invention concerns an isolated nucleic acid molecule comprising DNA having
at least about 80% sequence identity, preferably at least about 85% sequence identity, more preferably at least
about 90% sequence identity, most preferably at least about 95% sequence identity to (a) a DNA molecule
encoding the same mature polypeptide encoded by the human protein cDNA in ATCC Deposit No. 203089
(DNA59827-1426) or (b) the complement of the nucleic acid molecule of (a). In a preferred embodiment, the
nucleic acid comprises a DNA encoding the same mature polypeptide encoded by the human protein cDNA in
ATCC Deposit No. 203089 (DNA59827-1426).

In still a further aspect, the invention concerns an isolated nucleic acid molecule comprising (a) DNA
encoding a polypeptide having at least about 80% sequence identity, preferably at least about 85% sequence
identity, more preferably at least about 90% sequence identity, most preferably at least about 95% sequence
identity to the sequence of amino acid residues 1 or about 25 to about 153, inclusive of Figure 235 (SEQ ID
NO:334), or (b) the complement of the DNA of (a).
In a further aspect, the invention concerns an isolated nucleic acid molecule having at least 10
nucleotides and produced by hybridizing a test DNA molecule under stringent conditions with (a) a DNA
molecule encoding a PRO1064 polypeptide having the sequence of amino acid residues from 1 or about 25 to

an isolated PRO polypeptide nucleic acid molecule includes PRO polypeptide nucleic acid molecules contained
in cells that ordinarily express the PRO polypeptide where, for example, the nucleic acid molecule is in a
chromosomal location different from that of natural cells.

The term "control sequences" refers to DNA sequences necessary for the expression of an operably 
linked coding sequence in a particular host organism. The control sequences that are suitable for prokaryotes, 
for example, include a promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells 
are known to utilize promoters, polyadenylation signals, and enhancers. 

Nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic 
acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a 
polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or 
enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome 
binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, 
"operably linked" means that the DNA sequences being linked are contiguous, and, in the case of a secretory 
leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is 
accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide 
adaptors or linkers are used in accordance with conventional practice. 

The term "antibody" is used in the broadest sense and specifically covers, for example, single anti-PRO 
monoclonal antibodies (including agonist, antagonist, and neutralizing antibodies), anti-PRO antibody 
compositions with polyepitopic specificity, single chain anti-PRO antibodies, and fragments of anti-PRO 
antibodies (see below). The term "monoclonal antibody" as used herein refers to an antibody obtained from a 
population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are 
identical except for possible naturally-occurring mutations that may be present in minor amounts. 

"Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art, and 
generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. 
In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower 
temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when 
complementary strands are present in an environment below their melting temperature. The higher the degree 
of desired homology between the probe and hybridizable sequence, the higher the relative temperature which 
can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions 
more stringent, while lower temperatures less so. For additional details and explanation of stringency of 
hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience 
Publishers, (1995). 
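
The dependence of annealing temperature on probe length and base composition can be made concrete with a rough sketch. It uses the widely quoted Wallace rule of thumb for short oligonucleotides (2°C per A/T, 4°C per G/C), which is an outside approximation and not a formula taken from this specification; it ignores salt concentration and formamide, both of which the text identifies as relevant. The function name is hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Approximate melting temperature (deg C) of a short DNA probe.

    Wallace rule of thumb: Tm = 2*(A+T) + 4*(G+C). Only a rough guide for
    short oligonucleotides; salt and denaturants are not modelled.
    """
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    if at + gc != len(probe):
        raise ValueError("probe must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Doubling the probe length doubles the estimate, in line with the statement
# that longer probes require higher temperatures for proper annealing.
short_probe_tm = wallace_tm("ATGC")
long_probe_tm = wallace_tm("ATGCATGC")
```

Under this approximation a GC-rich probe also melts at a higher temperature than an AT-rich probe of the same length, so both length and composition would feed into the empirical choice of wash temperature.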

"Stringent conditions" or "high stringency conditions", as defined herein, may be identified by those 
that: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium 
chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridization a 
denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum 
albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium 
chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5 x SSC (0.75 M NaCl, 0.075 M 

include agonist or antagonist antibodies or antibody fragments, fragments or amino acid sequence variants of 
native PRO polypeptides, peptides, small organic molecules, etc. Methods for identifying agonists or 
antagonists of a PRO polypeptide may comprise contacting a PRO polypeptide with a candidate agonist or 
antagonist molecule and measuring a detectable change in one or more biological activities normally associated 
with the PRO polypeptide. 

"Treatment" refers to both therapeutic treatment and prophylactic or preventative measures, wherein 
the object is to prevent or slow down (lessen) the targeted pathologic condition or disorder. Those in need of 
treatment include those already with the disorder as well as those prone to have the disorder or those in whom 
the disorder is to be prevented. 

"Chronic" administration refers to administration of the agent(s) in a continuous mode as opposed to 
an acute mode, so as to maintain the initial therapeutic effect (activity) for an extended period of time. 
"Intermittent" administration is treatment that is not consecutively done without interruption, but rather is cyclic 
in nature. 

"Mammal" for purposes of treatment refers to any animal classified as a mammal, including humans, 
domestic and farm animals, and zoo, sports, or pet animals, such as dogs, cats, cattle, horses, sheep, pigs, goats, 
rabbits, etc. Preferably, the mammal is human. 

Administration "in combination with" one or more further therapeutic agents includes simultaneous 
(concurrent) and consecutive administration in any order. 

"Carriers" as used herein include pharmaceutically acceptable carriers, excipients, or stabilizers which 
are nontoxic to the cell or mammal being exposed thereto at the dosages and concentrations employed. Often 
the physiologically acceptable carrier is an aqueous pH buffered solution. Examples of physiologically 
acceptable carriers include buffers such as phosphate, citrate, and other organic acids; antioxidants including 
ascorbic acid; low molecular weight (less than about 10 residues) polypeptide; proteins, such as serum albumin, 
gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, 
glutamine, asparagine, arginine or lysine; monosaccharides, disaccharides, and other carbohydrates including 
glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or sorbitol; salt- 
forming counterions such as sodium; and/or nonionic surfactants such as TWEEN™, polyethylene glycol (PEG), 
and PLURONICS™. 

"Antibody fragments" comprise a portion of an intact antibody, preferably the antigen binding or 
variable region of the intact antibody. Examples of antibody fragments include Fab, Fab', F(ab')2, and Fv 
fragments; diabodies; linear antibodies (Zapata et al., Protein Eng. 8(10):1057-1062 [1995]); single-chain 
antibody molecules; and multispecific antibodies formed from antibody fragments. 

Papain digestion of antibodies produces two identical antigen-binding fragments, called "Fab" 
fragments, each with a single antigen-binding site, and a residual "Fc" fragment, a designation reflecting the 
ability to crystallize readily. Pepsin treatment yields an F(ab')2 fragment that has two antigen-combining sites 
and is still capable of cross-linking antigen. 

"Fv" is the minimum antibody fragment which contains a complete antigen-recognition and -binding 
site. This region consists of a dimer of one heavy- and one light-chain variable domain in tight, non-covalent 
association. It is in this configuration that the three CDRs of each variable domain interact to define an antigen- 
binding site on the surface of the VH-VL dimer. Collectively, the six CDRs confer antigen-binding specificity 
to the antibody. However, even a single variable domain (or half of an Fv comprising only three CDRs specific 
for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than the entire binding 
site. 

The Fab fragment also contains the constant domain of the light chain and the first constant domain 
(CH1) of the heavy chain. Fab' fragments differ from Fab fragments by the addition of a few residues at the 
carboxy terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge 
region. Fab'-SH is the designation herein for Fab' in which the cysteine residue(s) of the constant domains bear 
a free thiol group. F(ab')2 antibody fragments originally were produced as pairs of Fab' fragments which have 
hinge cysteines between them. Other chemical couplings of antibody fragments are also known. 

The "light chains" of antibodies (immunoglobulins) from any vertebrate species can be assigned to one 
of two clearly distinct types, called kappa and lambda, based on the amino acid sequences of their constant 
domains. 

Depending on the amino acid sequence of the constant domain of their heavy chains, immunoglobulins 
can be assigned to different classes. There are five major classes of immunoglobulins: IgA, IgD, IgE, IgG, and 
IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA, 
and IgA2. 

"Single-chain Fv" or "sFv" antibody fragments comprise the VH and VL domains of antibody, wherein 
these domains are present in a single polypeptide chain. Preferably, the Fv polypeptide further comprises a 
polypeptide linker between the VH and VL domains which enables the sFv to form the desired structure for 
antigen binding. For a review of sFv, see Pluckthun in The Pharmacology of Monoclonal Antibodies, vol. 113, 
Rosenburg and Moore eds., Springer-Verlag, New York, pp. 269-315 (1994). 

The term "diabodies" refers to small antibody fragments with two antigen-binding sites, which 
fragments comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) in the 
same polypeptide chain (VH-VL). By using a linker that is too short to allow pairing between the two domains 
on the same chain, the domains are forced to pair with the complementary domains of another chain and create 
two antigen-binding sites. Diabodies are described more fully in, for example, EP 404,097; WO 93/11161; and 
Hollinger et al., Proc. Natl. Acad. Sci. USA, 90:6444-6448 (1993). 

An "isolated" antibody is one which has been identified and separated and/or recovered from a 
component of its natural environment. Contaminant components of its natural environment are materials which 
would interfere with diagnostic or therapeutic uses for the antibody, and may include enzymes, hormones, and 
other proteinaceous or nonproteinaceous solutes. In preferred embodiments, the antibody will be purified (1) 
to greater than 95% by weight of antibody as determined by the Lowry method, and most preferably more than 
99% by weight, (2) to a degree sufficient to obtain at least 15 residues of N-terminal or internal amino acid 
sequence by use of a spinning cup sequenator, or (3) to homogeneity by SDS-PAGE under reducing or 
nonreducing conditions using Coomassie blue or, preferably, silver stain. Isolated antibody includes the antibody 
in situ within recombinant cells since at least one component of the antibody's natural environment will not be 
present. Ordinarily, however, isolated antibody will be prepared by at least one purification step. 

The word "label" when used herein refers to a detectable compound or composition which is conjugated 
directly or indirectly to the antibody so as to generate a "labeled" antibody. The label may be detectable by itself 
(e.g., radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical 
alteration of a substrate compound or composition which is detectable. 

By "solid phase" is meant a non-aqueous matrix to which the antibody of the present invention can 
adhere. Examples of solid phases encompassed herein include those formed partially or entirely of glass (e.g., 
controlled pore glass), polysaccharides (e.g., agarose), polyacrylamides, polystyrene, polyvinyl alcohol and 
silicones. In certain embodiments, depending on the context, the solid phase can comprise the well of an assay 
plate; in others it is a purification column (e.g., an affinity chromatography column). This term also includes 
a discontinuous solid phase of discrete particles, such as those described in U.S. Patent No. 4,275,149. 

A "liposome" is a small vesicle composed of various types of lipids, phospholipids and/or surfactant 
which is useful for delivery of a drug (such as a PRO polypeptide or antibody thereto) to a mammal. The 
components of the liposome are commonly arranged in a bilayer formation, similar to the lipid arrangement of 
biological membranes. 

A "small molecule" is defined herein to have a molecular weight below about 500 Daltons. 

II. Compositions and Methods of the Invention 

The present invention provides newly identified and isolated nucleotide sequences encoding polypeptides 
referred to in the present application as PRO polypeptides. In particular, cDNAs encoding various PRO 
polypeptides have been identified and isolated, as disclosed in further detail in the Examples below. It is noted 
that proteins produced in separate expression rounds may be given different PRO numbers but the UNQ number 
is unique for any given DNA and the encoded protein, and will not be changed. However, for sake of 
simplicity, in the present specification the protein encoded by the full length native nucleic acid molecules 
disclosed herein as well as all further native homologues and variants included in the foregoing definition of 
PRO, will be referred to as "PRO/number", regardless of their origin or mode of preparation. 

As disclosed in the Examples below, various cDNA clones have been deposited with the ATCC. The 
actual nucleotide sequences of those clones can readily be determined by the skilled artisan by sequencing of the 
deposited clone using routine methods in the art. The predicted amino acid sequence can be determined from 
the nucleotide sequence using routine skill. For the PRO polypeptides and encoding nucleic acids described 
herein, Applicants have identified what is believed to be the reading frame best identifiable with the sequence 
information available at the time. 
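
As a hedged illustration of how a predicted amino acid sequence and a likely reading frame might be derived from a nucleotide sequence, the sketch below translates the three forward frames with the standard genetic code and picks the frame with the longest stretch that starts at a methionine and runs to a stop codon (or to the end of the sequence). The selection heuristic and all function names are assumptions made for illustration; they are not the procedure Applicants used.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward frame (0, 1 or 2); '*' is stop, 'X' is unknown."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(protein: str) -> str:
    """Longest Met-initiated stretch between stop codons in a translation."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


def best_reading_frame(dna: str):
    """Return (frame, orf) for the forward frame with the longest ORF."""
    orfs = [(frame, longest_orf(translate(dna, frame))) for frame in range(3)]
    return max(orfs, key=lambda item: len(item[1]))
```

A fuller treatment would also scan the three frames of the reverse complement and would weigh other evidence, such as signal sequences or homology, when choosing between candidate frames.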

1. Full-length PRO281 Polypeptides 

The present invention provides newly identified and isolated nucleotide sequences encoding polypeptides 
referred to in the present application as PRO281 (UNQ244). In particular, cDNA encoding a PRO281 

Another means of increasing the number of carbohydrate moieties on the PRO polypeptide is by 
chemical or enzymatic coupling of glycosides to the polypeptide. Such methods are described in the art, e.g., 
in WO 87/05330 published 11 September 1987, and in Aplin and Wriston, CRC Crit. Rev. Biochem., pp. 259- 
306 (1981). 

Removal of carbohydrate moieties present on the PRO polypeptide may be accomplished chemically 
or enzymatically or by mutational substitution of codons encoding for amino acid residues that serve as targets 
for glycosylation. Chemical deglycosylation techniques are known in the art and described, for instance, by 
Hakimuddin, et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131 
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a variety 
of endo- and exo-glycosidases as described by Thotakura et al., Meth. Enzymol., 138:350 (1987). 
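
The mutational route to removing carbohydrate can be sketched for the N-linked case. The example assumes the commonly cited consensus sequon Asn-X-Ser/Thr (X being any residue except proline) as the glycosylation target and uses an Asn-to-Gln substitution as the illustrative mutation; both are assumptions brought in for the example rather than statements from this passage, and O-linked sites are not covered. Function names are hypothetical.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list:
    """1-based positions of Asn residues in candidate N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


def remove_sequons(protein: str, replacement: str = "Q") -> str:
    """Substitute the Asn of every candidate sequon (default Asn -> Gln)."""
    residues = list(protein.upper())
    for position in find_sequons(protein):
        residues[position - 1] = replacement
    return "".join(residues)
```

A sequon found this way is only a candidate: whether a given site is actually glycosylated depends on the host cell and the protein context, so any predicted site would need experimental confirmation.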

Another type of covalent modification of PRO comprises linking the PRO polypeptide to one of a variety 
of nonproteinaceous polymers, e.g., polyethylene glycol (PEG), polypropylene glycol, or polyoxyalkylenes, in 
the manner set forth in U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337. 

The PRO of the present invention may also be modified in a way to form a chimeric molecule 
comprising PRO fused to another, heterologous polypeptide or amino acid sequence. 

In one embodiment, such a chimeric molecule comprises a fusion of the PRO with a tag polypeptide 
which provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is generally placed 
at the amino- or carboxyl- terminus of the PRO. The presence of such epitope-tagged forms of the PRO can be 
detected using an antibody against the tag polypeptide. Also, provision of the epitope tag enables the PRO to 
be readily purified by affinity purification using an anti-tag antibody or another type of affinity matrix that binds 
to the epitope tag. Various tag polypeptides and their respective antibodies are well known in the art. Examples 
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; the flu HA tag polypeptide and its 
antibody 12CA5 [Field et al., Mol. Cell. Biol., 8:2159-2165 (1988)]; the c-myc tag and the 8F9, 3C7, 6E10, 
G4, B7 and 9E10 antibodies thereto [Evan et al., Molecular and Cellular Biology, 5:3610-3616 (1985)]; and the 
Herpes Simplex virus glycoprotein D (gD) tag and its antibody [Paborsky et al., Protein Engineering, 3(6):547- 
553 (1990)]. Other tag polypeptides include the Flag-peptide [Hopp et al., BioTechnology, 6:1204-1210 
(1988)]; the KT3 epitope peptide [Martin et al., Science, 255:192-194 (1992)]; an α-tubulin epitope peptide 
[Skinner et al., J. Biol. Chem., 266:15163-15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz- 
Freyermuth et al., Proc. Natl. Acad. Sci. USA, 87:6393-6397 (1990)]. 
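
Placement of an epitope tag at the amino- or carboxyl-terminus can be pictured at the sequence level with a small sketch. The tag sequences in the table are the commonly quoted ones for these tags and the six-residue poly-his length is an arbitrary choice; all are assumptions for illustration, as are the function name and the toy protein sequence in the usage lines.

```python
# Commonly quoted tag sequences (assumed here, not taken from the text).
EPITOPE_TAGS = {
    "poly-his": "HHHHHH",      # six histidines chosen for illustration
    "flag": "DYKDDDDK",
    "ha": "YPYDVPDYA",
    "c-myc": "EQKLISEEDL",
}


def add_tag(protein: str, tag_name: str, terminus: str = "C") -> str:
    """Return the protein sequence fused to a tag at the N- or C-terminus."""
    try:
        tag = EPITOPE_TAGS[tag_name.lower()]
    except KeyError:
        raise ValueError(f"unknown tag: {tag_name}") from None
    terminus = terminus.upper()
    if terminus == "N":
        return tag + protein
    if terminus == "C":
        return protein + tag
    raise ValueError("terminus must be 'N' or 'C'")


# Hypothetical toy sequence standing in for a PRO polypeptide.
toy_pro = "MKTLLV"
c_terminal_flag = add_tag(toy_pro, "flag", "C")
n_terminal_his = add_tag(toy_pro, "poly-his", "N")
```

In practice the fusion would be built at the DNA level, in frame with the coding sequence, and an amino-terminal tag on a secreted protein would normally be placed after the signal sequence; the sketch ignores both points.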

In an alternative embodiment, the chimeric molecule may comprise a fusion of the PRO with an 
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the chimeric molecule (also 
referred to as an "immunoadhesin"), such a fusion could be to the Fc region of an IgG molecule. The Ig fusions 
preferably include the substitution of a soluble (transmembrane domain deleted or inactivated) form of a PRO 
polypeptide in place of at least one variable region within an Ig molecule. In a particularly preferred 
embodiment, the immunoglobulin fusion includes the hinge, CH2 and CH3, or the hinge, CH1, CH2 and CH3 
regions of an IgG1 molecule. For the production of immunoglobulin fusions see also US Patent No. 5,428,130 
issued June 27, 1995. 

D. Preparation of PRO 

The description below relates primarily to production of PRO by culturing cells transformed or 
transfected with a vector containing PRO nucleic acid. It is, of course, contemplated that alternative methods, 
which are well known in the art, may be employed to prepare PRO. For instance, the PRO sequence, or 
portions thereof, may be produced by direct peptide synthesis using solid-phase techniques [see, e.g., Stewart 
et al., Solid-Phase Peptide Synthesis, W.H. Freeman Co., San Francisco, CA (1969); Merrifield, J. Am. Chem. 
Soc., 85:2149-2154 (1963)]. In vitro protein synthesis may be performed using manual techniques or by 
automation. Automated synthesis may be accomplished, for instance, using an Applied Biosystems Peptide 
Synthesizer (Foster City, CA) using manufacturer's instructions. Various portions of the PRO may be 
chemically synthesized separately and combined using chemical or enzymatic methods to produce the full-length 
PRO. 



1. Isolation of DNA Encoding PRO 
DNA encoding PRO may be obtained from a cDNA library prepared from tissue believed to possess 
the PRO mRNA and to express it at a detectable level. Accordingly, human PRO DNA can be conveniently 
obtained from a cDNA library prepared from human tissue, such as described in the Examples. The PRO- 
encoding gene may also be obtained from a genomic library or by known synthetic procedures (e.g., automated 
nucleic acid synthesis). 

Libraries can be screened with probes (such as antibodies to the PRO or oligonucleotides of at least 
about 20-80 bases) designed to identify the gene of interest or the protein encoded by it. Screening the cDNA 
or genomic library with the selected probe may be conducted using standard procedures, such as described in 
Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 
1989). An alternative means to isolate the gene encoding PRO is to use PCR methodology [Sambrook et al., 
supra; Dieffenbach et al., PCR Primer: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 1995)]. 

The Examples below describe techniques for screening a cDNA library. The oligonucleotide sequences 
selected as probes should be of sufficient length and sufficiently unambiguous that false positives are minimized. 
The oligonucleotide is preferably labeled such that it can be detected upon hybridization to DNA in the library 
being screened. Methods of labeling are well known in the art, and include the use of radiolabels like 32P-labeled 
ATP, biotinylation or enzyme labeling. Hybridization conditions, including moderate stringency and high 
stringency, are provided in Sambrook et al., supra. 

Sequences identified in such library screening methods can be compared and aligned to other known 
sequences deposited and available in public databases such as GenBank or other private sequence databases. 
Sequence identity (at either the amino acid or nucleotide level) within defined regions of the molecule or across 
the full-length sequence can be determined using methods known in the art and as described herein. 

Nucleic acid having protein coding sequence may be obtained by screening selected cDNA or genomic 
libraries using the deduced amino acid sequence disclosed herein for the first time, and, if necessary, using 
conventional primer extension procedures as described in Sambrook et al., supra, to detect precursors and 
processing intermediates of mRNA that may not have been reverse-transcribed into cDNA. 

2. Selection and Transformation of Host Cells 
Host cells are transfected or transformed with expression or cloning vectors described herein for PRO 
production and cultured in conventional nutrient media modified as appropriate for inducing promoters, selecting 
transformants, or amplifying the genes encoding the desired sequences. The culture conditions, such as media, 
temperature, pH and the like, can be selected by the skilled artisan without undue experimentation. In general, 
principles, protocols, and practical techniques for maximizing the productivity of cell cultures can be found in 
Mammalian Cell Biotechnology: a Practical Approach, M. Butler, ed. (IRL Press, 1991) and Sambrook et al., 
supra. 

Methods of eukaryotic cell transfection and prokaryotic cell transformation are known to the ordinarily 
skilled artisan, for example, CaCl2, CaPO4, liposome-mediated and electroporation. Depending on the host cell 
used, transformation is performed using standard techniques appropriate to such cells. The calcium treatment 
employing calcium chloride, as described in Sambrook et al., supra, or electroporation is generally used for 
prokaryotes. Infection with Agrobacterium tumefaciens is used for transformation of certain plant cells, as 
described by Shaw et al., Gene, 23:315 (1983) and WO 89/05859 published 29 June 1989. For mammalian cells 
without such cell walls, the calcium phosphate precipitation method of Graham and van der Eb, Virology, 
52:456-457 (1978) can be employed. General aspects of mammalian cell host system transfections have been 
described in U.S. Patent No. 4,399,216. Transformations into yeast are typically carried out according to the 
method of Van Solingen et al., J. Bact., 130:946 (1977) and Hsiao et al., Proc. Natl. Acad. Sci. (USA), 76:3829 
(1979). However, other methods for introducing DNA into cells, such as by nuclear microinjection, 
electroporation, bacterial protoplast fusion with intact cells, or polycations, e.g., polybrene, polyornithine, may 
also be used. For various techniques for transforming mammalian cells, see Keown et al., Methods in 
Enzymology, 185:527-537 (1990) and Mansour et al., Nature, 336:348-352 (1988). 

Suitable host cells for cloning or expressing the DNA in the vectors herein include prokaryote, yeast, 
or higher eukaryote cells. Suitable prokaryotes include but are not limited to eubacteria, such as Gram-negative 
or Gram-positive organisms, for example, Enterobacteriaceae such as E. coli. Various E. coli strains are 
publicly available, such as E. coli K12 strain MM294 (ATCC 31,446); E. coli X1776 (ATCC 31,537); E. coli 
strain W3110 (ATCC 27,325) and K5 772 (ATCC 53,635). Other suitable prokaryotic host cells include 
Enterobacteriaceae such as Escherichia, e.g., E. coli, Enterobacter, Erwinia, Klebsiella, Proteus, Salmonella, 
e.g., Salmonella typhimurium, Serratia, e.g., Serratia marcescens, and Shigella, as well as Bacilli such as B. 
subtilis and B. licheniformis (e.g., B. licheniformis 41P disclosed in DD 266,710 published 12 April 1989), 
Pseudomonas such as P. aeruginosa, and Streptomyces. These examples are illustrative rather than limiting. 
Strain W3110 is one particularly preferred host or parent host because it is a common host strain for recombinant 
DNA product fermentations. Preferably, the host cell secretes minimal amounts of proteolytic enzymes. For 
example, strain W3110 may be modified to effect a genetic mutation in the genes encoding proteins endogenous 
to the host, with examples of such hosts including E. coli W3110 strain 1A2, which has the complete genotype 
tonA; E. coli W3110 strain 9E4, which has the complete genotype tonA ptr3; E. coli W3110 strain 27C7 
(ATCC 55,244), which has the complete genotype tonA ptr3 phoA E15 (argF-lac)169 degP ompT kanr; E. coli 
W3110 strain 37D6, which has the complete genotype tonA ptr3 phoA E15 (argF-lac)169 degP ompT rbs7 
ilvG kanr; E. coli W3110 strain 40B4, which is strain 37D6 with a non-kanamycin resistant degP deletion 
mutation; and an E. coli strain having mutant periplasmic protease disclosed in U.S. Patent No. 4,946,783 issued 
7 August 1990. Alternatively, in vitro methods of cloning, e.g., PCR or other nucleic acid polymerase reactions, 
are suitable. 

In addition to prokaryotes, eukaryotic microbes such as filamentous fungi or yeast are suitable cloning 
or expression hosts for PRO-encoding vectors. Saccharomyces cerevisiae is a commonly used lower eukaryotic 
host microorganism. Others include Schizosaccharomyces pombe (Beach and Nurse, Nature, 290:140 [1981]; 
EP 139,383 published 2 May 1985); Kluyveromyces hosts (U.S. Patent No. 4,943,529; Fleer et al., 
Bio/Technology, 9:968-975 (1991)) such as, e.g., K. lactis (MW98-8C, CBS683, CBS4574; Louvencourt et al., 
J. Bacteriol., 737 [1983]), K. fragilis (ATCC 12,424), K. bulgaricus (ATCC 16,045), K. wickeramii (ATCC 
24,178), K. waltii (ATCC 56,500), K. drosophilarum (ATCC 36,906; Van den Berg et al., Bio/Technology, 
8:135 (1990)), K. thermotolerans, and K. marxianus; Yarrowia (EP 402,226); Pichia pastoris (EP 183,070; 
Sreekrishna et al., J. Basic Microbiol., 28:265-278 [1988]); Candida; Trichoderma reesia (EP 244,234); 
Neurospora crassa (Case et al., Proc. Natl. Acad. Sci. USA, 76:5259-5263 [1979]); Schwanniomyces such as 
Schwanniomyces occidentalis (EP 394,538 published 31 October 1990); and filamentous fungi such as, e.g., 
Neurospora, Penicillium, Tolypocladium (WO 91/00357 published 10 January 1991), and Aspergillus hosts such 
as A. nidulans (Ballance et al., Biochem. Biophys. Res. Commun., 112:284-289 [1983]; Tilburn et al., Gene, 
26:205-221 [1983]; Yelton et al., Proc. Natl. Acad. Sci. USA, 81:1470-1474 [1984]) and A. niger (Kelly and 
Hynes, EMBO J., 4:475-479 [1985]). Methylotrophic yeasts are suitable herein and include, but are not limited 
to, yeast capable of growth on methanol selected from the genera consisting of Hansenula, Candida, Kloeckera, 
Pichia, Saccharomyces, Torulopsis, and Rhodotorula. A list of specific species that are exemplary of this class 
of yeasts may be found in C. Anthony, The Biochemistry of Methylotrophs, 269 (1982). 

Suitable host cells for the expression of glycosylated PRO are derived from multicellular organisms. 
Examples of invertebrate cells include insect cells such as Drosophila S2 and Spodoptera Sf9, as well as plant 
cells. Examples of useful mammalian host cell lines include Chinese hamster ovary (CHO) and COS cells. 
More specific examples include monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); 
human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture, Graham et al., J. 
Gen Virol., 36:59 (1977)); Chinese hamster ovary cells/-DHFR (CHO, Urlaub and Chasin, Proc. Natl. Acad. 
Sci. USA, 77:4216 (1980)); mouse Sertoli cells (TM4, Mather, Biol. Reprod., 23:243-251 (1980)); human lung 
cells (W138, ATCC CCL 75); human liver cells (Hep G2, HB 8065); and mouse mammary tumor (MMT 
060562, ATCC CCL51). The selection of the appropriate host cell is deemed to be within the skill in the art. 

3. Selection and Use of a Replicable Vector 
The nucleic acid (e.g., cDNA or genomic DNA) encoding PRO may be inserted into a replicable vector 
for cloning (amplification of the DNA) or for expression. Various vectors are publicly available. The vector 
may, for example, be in the form of a plasmid, cosmid, viral particle, or phage. The appropriate nucleic acid 
sequence may be inserted into the vector by a variety of procedures. In general, DNA is inserted into an 
appropriate restriction endonuclease site(s) using techniques known in the art. Vector components generally 
include, but are not limited to, one or more of a signal sequence, an origin of replication, one or more marker 
genes, an enhancer element, a promoter, and a transcription termination sequence. Construction of suitable 
vectors containing one or more of these components employs standard ligation techniques which are known to 
the skilled artisan. 

The PRO may be produced recombinantly not only directly, but also as a fusion polypeptide with a 
heterologous polypeptide, which may be a signal sequence or other polypeptide having a specific cleavage site 
at the N-terminus of the mature protein or polypeptide. In general, the signal sequence may be a component of 
the vector, or it may be a part of the PRO-encoding DNA that is inserted into the vector. The signal sequence 
may be a prokaryotic signal sequence selected, for example, from the group of the alkaline phosphatase, 
penicillinase, lpp, or heat-stable enterotoxin II leaders. For yeast secretion the signal sequence may be, e.g., 
the yeast invertase leader, alpha factor leader (including Saccharomyces and Kluyveromyces α-factor leaders, 
the latter described in U.S. Patent No. 5,010,182), or acid phosphatase leader, the C. albicans glucoamylase 
leader (EP 362,179 published 4 April 1990), or the signal described in WO 90/13646 published 15 November 
1990. In mammalian cell expression, mammalian signal sequences may be used to direct secretion of the 
protein, such as signal sequences from secreted polypeptides of the same or related species, as well as viral 
secretory leaders. 

Both expression and cloning vectors contain a nucleic acid sequence that enables the vector to replicate 
in one or more selected host cells. Such sequences are well known for a variety of bacteria, yeast, and viruses. 
The origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria, the 2μ plasmid 
origin is suitable for yeast, and various viral origins (SV40, polyoma, adenovirus, VSV or BPV) are useful for 
cloning vectors in mammalian cells. 

Expression and cloning vectors will typically contain a selection gene, also termed a selectable marker.
Typical selection genes encode proteins that (a) confer resistance to antibiotics or other toxins, e.g., ampicillin,
neomycin, methotrexate, or tetracycline, (b) complement auxotrophic deficiencies, or (c) supply critical nutrients
not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli.

Examples of suitable selectable markers for mammalian cells are those that enable the identification
of cells competent to take up the PRO-encoding nucleic acid, such as DHFR or thymidine kinase. An
appropriate host cell when wild-type DHFR is employed is the CHO cell line deficient in DHFR activity,
prepared and propagated as described by Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216 (1980). A suitable
selection gene for use in yeast is the trp1 gene present in the yeast plasmid YRp7 [Stinchcomb et al., Nature,
282:39 (1979); Kingsman et al., Gene, 7:141 (1979); Tschemper et al., Gene, 10:157 (1980)]. The trp1 gene
provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example,
ATCC No. 44076 or PEP4-1 [Jones, Genetics, 85:12 (1977)].

Expression and cloning vectors usually contain a promoter operably linked to the PRO-encoding nucleic
acid sequence to direct mRNA synthesis. Promoters recognized by a variety of potential host cells are well
known. Promoters suitable for use with prokaryotic hosts include the β-lactamase and lactose promoter systems
[Chang et al., Nature, 275:615 (1978); Goeddel et al., Nature, 281:544 (1979)], alkaline phosphatase, a
tryptophan (trp) promoter system [Goeddel, Nucleic Acids Res., 8:4057 (1980); EP 36,776], and hybrid
promoters such as the tac promoter [deBoer et al., Proc. Natl. Acad. Sci. USA, 80:21-25 (1983)]. Promoters
for use in bacterial systems also will contain a Shine-Dalgarno (S.D.) sequence operably linked to the DNA
encoding PRO.

Examples of suitable promoting sequences for use with yeast hosts include the promoters for
3-phosphoglycerate kinase [Hitzeman et al., J. Biol. Chem., 255:2073 (1980)] or other glycolytic enzymes [Hess
et al., J. Adv. Enzyme Reg., 7:149 (1968); Holland, Biochemistry, 17:4900 (1978)], such as enolase,
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase,
phosphoglucose isomerase, and glucokinase.

Other yeast promoters, which are inducible promoters having the additional advantage of transcription 
controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid 
phosphatase, degradative enzymes associated with nitrogen metabolism, metallothionein,
glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization. Suitable vectors and
promoters for use in yeast expression are further described in EP 73,657. 

PRO transcription from vectors in mammalian host cells is controlled, for example, by promoters 
obtained from the genomes of viruses such as polyoma virus, fowlpox virus (UK 2,211,504 published 5 July 
1989), adenovirus (such as Adenovirus 2), bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a 
retrovirus, hepatitis-B virus and Simian Virus 40 (SV40), from heterologous mammalian promoters, e.g., the 
actin promoter or an immunoglobulin promoter, and from heat-shock promoters, provided such promoters are 
compatible with the host cell systems. 

Transcription of a DNA encoding the PRO by higher eukaryotes may be increased by inserting an 
enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually about from 10 to 300 
bp, that act on a promoter to increase its transcription. Many enhancer sequences are now known from 
mammalian genes (globin, elastase, albumin, α-fetoprotein, and insulin). Typically, however, one will use an
enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the replication 
origin (bp 100-270), the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the 
replication origin, and adenovirus enhancers. The enhancer may be spliced into the vector at a position 5' or 
3' to the PRO coding sequence, but is preferably located at a site 5' from the promoter. 

Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human, or nucleated 
cells from other multicellular organisms) will also contain sequences necessary for the termination of 
transcription and for stabilizing the mRNA. Such sequences are commonly available from the 5' and,
occasionally, 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide
segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA encoding PRO. 

Still other methods, vectors, and host cells suitable for adaptation to the synthesis of PRO in 
recombinant vertebrate cell culture are described in Gething et al., Nature, 293:620-625 (1981); Mantei et al.,
Nature, 281:40-46 (1979); EP 117,060; and EP 117,058.
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Ribozymes are enzymatic RNA molecules capable of catalyzing the specific cleavage of RNA. 
Ribozymes act by sequence-specific hybridization to the complementary target RNA, followed by endonucleolytic
cleavage. Specific ribozyme cleavage sites within a potential RNA target can be identified by known techniques.
For further details see, e.g., Rossi, Current Biology, 4:469-471 (1994), and PCT publication No. WO 97/33551
(published September 18, 1997).

Nucleic acid molecules in triple-helix formation used to inhibit transcription should be single-stranded
and composed of deoxynucleotides. The base composition of these oligonucleotides is designed such that it
promotes triple-helix formation via Hoogsteen base-pairing rules, which generally require sizeable stretches of
purines or pyrimidines on one strand of a duplex. For further details see, e.g., PCT publication No. WO
97/33551, supra.
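
The requirement for sizeable stretches of purines or pyrimidines on one strand can be illustrated with a short scan for candidate tracts. The following Python sketch is illustrative only: the function name, the example sequence, and the minimum tract length of 10 bases are assumptions made for this example and are not taken from the specification or from WO 97/33551.

```python
import re


def homopolymer_tracts(seq, min_len=10):
    """Return (start, end, kind) for each run made only of purines (A/G)
    or only of pyrimidines (C/T) that is at least min_len bases long.
    Coordinates are 0-based, end-exclusive."""
    seq = seq.upper()
    # min_len is a hypothetical cutoff; the text says only "sizeable stretches".
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq):
        kind = "purine" if match.group()[0] in "AG" else "pyrimidine"
        tracts.append((match.start(), match.end(), kind))
    return tracts


# Hypothetical duplex strand containing one purine tract and one pyrimidine tract.
example = "CATGAGGAGAAGGATCACTTCCTCTTCG"
for start, end, kind in homopolymer_tracts(example):
    print(start, end, kind, example[start:end])
```

Tracts found this way are only candidate target sites; whether a given tract actually supports triple-helix formation would still have to be determined experimentally.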

These small molecules can be identified by any one or more of the screening assays discussed
hereinabove and/or by any other screening techniques well known for those skilled in the art.

PRO189 can be used in assays with W01A6.1 of C. elegans, phosphodiesterases, transporters and
proteins which bind to fatty acids, to determine the relative activities of PRO189 against these proteins. The
results can be applied accordingly.

F. Anti-PRO Antibodies 
The present invention further provides anti-PRO antibodies. Exemplary antibodies include polyclonal, 
monoclonal, humanized, bispecific, and heteroconjugate antibodies. 



1. Polyclonal Antibodies

The anti-PRO antibodies may comprise polyclonal antibodies. Methods of preparing polyclonal 
antibodies are known to the skilled artisan. Polyclonal antibodies can be raised in a mammal, for example, by 
one or more injections of an immunizing agent and, if desired, an adjuvant. Typically, the immunizing agent 
and/or adjuvant will be injected in the mammal by multiple subcutaneous or intraperitoneal injections. The
immunizing agent may include the PRO polypeptide or a fusion protein thereof. It may be useful to conjugate
the immunizing agent to a protein known to be immunogenic in the mammal being immunized. Examples of
such immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine
thyroglobulin, and soybean trypsin inhibitor. Examples of adjuvants which may be employed include Freund's
complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).
The immunization protocol may be selected by one skilled in the art without undue experimentation.

2. Monoclonal Antibodies 
The anti-PRO antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies may be
prepared using hybridoma methods, such as those described by Kohler and Milstein, Nature, 256:495 (1975).
In a hybridoma method, a mouse, hamster, or other appropriate host animal is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies that will specifically
bind to the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro.

The immunizing agent will typically include the PRO polypeptide or a fusion protein thereof.
Generally, either peripheral blood lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen
cells or lymph node cells are used if non-human mammalian sources are desired. The lymphocytes are then
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell [Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp. 59-103].
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine
and human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells may be
cultured in a suitable culture medium that preferably contains one or more substances that inhibit the growth or
survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme hypoxanthine
guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will
include hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the growth of
HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level expression of
antibody by the selected antibody-producing cells, and are sensitive to a medium such as HAT medium. More
preferred immortalized cell lines are murine myeloma lines, which can be obtained, for instance, from the Salk
Institute Cell Distribution Center, San Diego, California and the American Type Culture Collection, Manassas,
Virginia. Human myeloma and mouse-human heteromyeloma cell lines also have been described for the
production of human monoclonal antibodies [Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63].

The culture medium in which the hybridoma cells are cultured can then be assayed for the presence of
monoclonal antibodies directed against PRO. Preferably, the binding specificity of monoclonal antibodies
produced by the hybridoma cells is determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are
known in the art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980).
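
By way of illustration only, the classical linear Scatchard transform for a single class of binding sites plots bound/free ligand against bound ligand, B/F = Ka(Bmax - B), so that the slope gives -Ka and the x-intercept gives Bmax. The Python sketch below fits that line by ordinary least squares. It is a simplified stand-in and not the procedure of Munson and Pollard; the function name, the synthetic data, and the units are assumptions made for this example.

```python
def scatchard_fit(bound, free):
    """Fit B/F = Ka * (Bmax - B) by least squares.

    bound, free: equal-length sequences of bound and free ligand
    concentrations. Returns (Ka, Bmax) for one class of binding sites."""
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope            # association constant (1 / Kd)
    bmax = intercept / ka  # x-intercept: total concentration of binding sites
    return ka, bmax


# Synthetic, noise-free data (hypothetical): Kd = 2.0 and Bmax = 10.0, arbitrary units.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
ka, bmax = scatchard_fit(bound, free)
print(round(ka, 6), round(bmax, 6))
```

With real assay data the points scatter about the line, and curvature in the plot can indicate more than one class of site, in which case a single straight-line fit of this kind would not be adequate.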

After the desired hybridoma cells are identified, the clones may be subcloned by limiting dilution
procedures and grown by standard methods [Goding, supra]. Suitable culture media for this purpose include,
for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. Alternatively, the hybridoma cells
may be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones may be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for example, protein
A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies may also be made by recombinant DNA methods, such as those described
in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the invention can be readily isolated
and sequenced using conventional procedures (e.g., by using oligonucleotide probes that are capable of binding
specifically to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells of the
invention serve as a preferred source of such DNA. Once isolated, the DNA may be placed into expression
vectors, which are then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells,
or myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal
antibodies in the recombinant host cells. The DNA also may be modified, for example, by substituting the
coding sequence for human heavy and light chain constant domains in place of the homologous murine sequences
[U.S. Patent No. 4,816,567; Morrison et al., supra] or by covalently joining to the immunoglobulin coding
sequence all or part of the coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can be substituted for
the variable domains of one antigen-combining site of an antibody of the invention to create a chimeric bivalent
antibody.

The antibodies may be monovalent antibodies. Methods for preparing monovalent antibodies are well 
known in the art. For example, one method involves recombinant expression of immunoglobulin light chain and 
modified heavy chain. The heavy chain is truncated generally at any point in the Fc region so as to prevent 
heavy chain crosslinking. Alternatively, the relevant cysteine residues are substituted with another amino acid 
residue or are deleted so as to prevent crosslinking. 

In vitro methods are also suitable for preparing monovalent antibodies. Digestion of antibodies to
produce fragments thereof, particularly Fab fragments, can be accomplished using routine techniques known
in the art.

3. Human and Humanized Antibodies 
The anti-PRO antibodies of the invention may further comprise humanized antibodies or human
antibodies. Humanized forms of non-human (e.g., murine) antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-binding
subsequences of antibodies) which contain minimal sequence derived from non-human immunoglobulin.
Humanized antibodies include human immunoglobulins (recipient antibody) in which residues from a
complementarity-determining region (CDR) of the recipient are replaced by residues from a CDR of a non-human
species (donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and capacity. In
some instances, Fv framework residues of the human immunoglobulin are replaced by corresponding non-human
residues. Humanized antibodies may also comprise residues which are found neither in the recipient antibody
nor in the imported CDR or framework sequences. In general, the humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of the FR regions are
those of a human immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin [Jones
et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-329 (1988); and Presta, Curr. Op. Struct.
Biol., 2:593-596 (1992)].

Methods for humanizing non-human antibodies are well known in the art. Generally, a humanized
antibody has one or more amino acid residues introduced into it from a source which is non-human. These
non-human amino acid residues are often referred to as "import" residues, which are typically taken from an
"import" variable domain. Humanization can be essentially performed following the method of Winter and co-workers
[Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR sequences for the corresponding
sequences of a human antibody. Accordingly, such "humanized" antibodies are chimeric antibodies (U.S. Patent
No. 4,816,567), wherein substantially less than an intact human variable domain has been substituted by the
corresponding sequence from a non-human species. In practice, humanized antibodies are typically human
antibodies in which some CDR residues and possibly some FR residues are substituted by residues from
analogous sites in rodent antibodies.

Human antibodies can also be produced using various techniques known in the art, including phage
display libraries [Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); Marks et al., J. Mol. Biol., 222:581
(1991)]. The techniques of Cole et al. and Boerner et al. are also available for the preparation of human
monoclonal antibodies [Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and
Boerner et al., J. Immunol., 147(1):86-95 (1991)]. Similarly, human antibodies can be made by introducing
human immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous immunoglobulin
genes have been partially or completely inactivated. Upon challenge, human antibody production is observed,
which closely resembles that seen in humans in all respects, including gene rearrangement, assembly, and
antibody repertoire. This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in the following scientific publications: Marks et al.,
Bio/Technology 10, 779-783 (1992); Lonberg et al., Nature 368, 856-859 (1994); Morrison, Nature 368, 812-13
(1994); Fishwild et al., Nature Biotechnology 14, 845-51 (1996); Neuberger, Nature Biotechnology 14, 826
(1996); Lonberg and Huszar, Intern. Rev. Immunol. 13, 65-93 (1995).


4. Bispecific Antibodies 
Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens. In the present case, one of the binding specificities is for the
PRO, the other one is for any other antigen, and preferably for a cell-surface protein or receptor or receptor
subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the recombinant
production of bispecific antibodies is based on the co-expression of two immunoglobulin heavy-chain/light-chain
pairs, where the two heavy chains have different specificities [Milstein and Cuello, Nature, 305:537-539 (1983)].
Because of the random assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas)
produce a potential mixture of ten different antibody molecules, of which only one has the correct bispecific
structure. The purification of the correct molecule is usually accomplished by affinity chromatography steps.
Similar procedures are disclosed in WO 93/08829, published 13 May 1993, and in Traunecker et al., EMBO
J., 10:3655-3659 (1991).
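
The figure of ten different molecules can be checked by simple enumeration. The Python sketch below assumes, for illustration only, that each antibody consists of two heavy-chain/light-chain halves, that either light chain may pair with either heavy chain, and that the two halves of a molecule are interchangeable. The chain labels H1, H2, L1 and L2 are hypothetical, with H1/L1 and H2/L2 taken as the cognate pairs.

```python
from itertools import combinations_with_replacement, product

heavy = ["H1", "H2"]  # two heavy chains of different specificity (hypothetical labels)
light = ["L1", "L2"]  # their respective cognate light chains

# Random assortment: any heavy chain may pair with any light chain (4 half-molecules).
half_molecules = list(product(heavy, light))

# An antibody is an unordered pair of half-molecules, identical halves allowed.
antibodies = list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(antibody):
    """True only when each heavy chain is paired with its cognate light chain
    and both specificities are present in the same molecule."""
    return set(antibody) == {("H1", "L1"), ("H2", "L2")}


correct = [ab for ab in antibodies if is_correct_bispecific(ab)]
print(len(antibodies), len(correct))  # prints: 10 1
```

Under these assumptions the count is 4 molecules with two identical halves plus 6 with two different halves, and exactly one of the ten carries both correctly paired specificities.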

Antibody variable domains with the desired binding specificities (antibody-antigen combining sites) can
be fused to immunoglobulin constant domain sequences. The fusion preferably is with an immunoglobulin
heavy-chain constant domain, comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to
have the first heavy-chain constant region (CH1) containing the site necessary for light-chain binding present in
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ID AAY66729 standard; protein; 775 AA. 
XX
AC AAY66729;
XX
DT 05-APR-2000 (first entry)
XX
DE Membrane-bound protein PRO1281.
XX
KW Membrane-bound polypeptide; PRO polypeptide; LDL receptor; TIE ligand;
KW pharmaceutical; receptor immunoadhesin; gene mapping.
XX
OS Homo sapiens.
XX
PN WO9963088-A2.
XX
PD 09-DEC-1999.
XX
PF 02-JUN-1999; 99WO-US12252.
XX
PR 02-JUN-1998; 98US-0087607.
XX
PA (GETH) GENENTECH INC.
XX
PI Baker K, Chen J, Goddard A, Gurney AL, Smith V, Watanabe CK;
PI Wood WI, Yuan J;
XX
DR WPI; 2000-072883/06.
DR N-PSDB; AAZ65074.
XX
PT Membrane-bound proteins and related nucleotide sequences
XX
PS claim 12; Fig 233; 822pp; English.
XX
CC The invention provides membrane-bound PRO polypeptides and
CC polynucleotides encoding them. The PRO sequences of the invention were
CC identified based on extracellular domain homology screening. The PRO
CC sequences have homology with proteins including LDL receptors, TIE
CC ligands and various enzymes. The membrane-bound proteins and receptor
CC molecules are useful as pharmaceutical and diagnostic agents. Receptor
CC immunoadhesins, for instance, can be used as therapeutic agents to block
CC receptor-ligand interactions. The membrane-bound proteins can also be
CC employed for screening of potential peptide or small molecule inhibitors
CC of the relevant receptor/ligand interaction. The PRO encoding sequences
CC are useful as hybridization probes, in chromosome and gene mapping and in
CC the generation of antisense RNA and DNA. PRO nucleic acid sequences
CC will also be useful for the preparation of PRO polypeptides, especially
CC by recombinant techniques.
XX
SQ Sequence 775 AA;

Query Match 100.0%; Score 4074; DB 21; Length 775;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 775; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MRASLLLSVLRPAGPVAVGISLGFTLSLLSVTWVEEPCGPGPPQPGDSELPPRGNTNAAR 60
Db 1 MRASLLLSVLRPAGPVAVGISLGFTLSLLSVTWVEEPCGPGPPQPGDSELPPRGNTNAAR 60

Qy 61 RPNSVQPGAEREKPGAGEGAGENWEPRVLPYHPAQPGQAAKKAVRTRYISTELGIRQRLL 120
Db 61 RPNSVQPGAEREKPGAGEGAGENWEPRVLPYHPAQPGQAAKKAVRTRYISTELGIRQRLL 120

Qy 121 VAVLTSQTTLPTLGVAVNRTLGHRLERWFLTGARGRRAPPGMAWTLGEERPIGHLHLA 180
Db 121 VAVLTSQTTLPTLGVAVNRTLGHRLERWFLTGARGRRAPPGMAWTLGEERPIGHLHLA 180

Qy 181 LRHLLEQHGDDFDWFFLVPDTTYTEAHGLARLTGHLSLASAAHLYLGRPQDFIGGEPTPG 240
Db 181 LRHLLEQHGDDFDWFFLVPDTTYTEAHGLARLTGHLSLASAAHLYLGRPQDFIGGEPTPG 240

Qy 241 RYCHGGFGVLLSRMLLQQLRPHLEGCRNDIVSARPDEWLGRCILDATGVGCTGDHEGVHY 300
Db 241 RYCHGGFGVLLSRMLLQQLRPHLEGCRNDIVSARPDEWLGRCILDATGVGCTGDHEGVHY 300

Qy 301 SHLELSPGEPVQEGDPHFRSALTAHPVRDPVHMYQLHKAFARAELERTYQEIQELQWEIQ 360
Db 301 SHLELSPGEPVQEGDPHFRSALTAHPVRDPVHMYQLHKAFARAELERTYQEIQELQWEIQ 360

Qy 361 NTSHLAVDGDRAAAWPVGIPAPSRPASRFEVLRWDYFTEQHAFSCADGSPRCPLRGADRA 420
Db 361 NTSHLAVDGDRAAAWPVGIPAPSRPASRFEVLRWDYFTEQHAFSCADGSPRCPLRGADRA 420

Qy 421 DVADVLGTALEELNRRYHPALRLQKQQLVNGYRRFDPARGMEYTLDLQLEALTPQGGRRP 480
Db 421 DVADVLGTALEELNRRYHPALRLQKQQLVNGYRRFDPARGMEYTLDLQLEALTPQGGRRP 480

Qy 481 LTRRVQLLRPLSRVEILPVPYVTEASRLTVLLPLAAAERDLAPGFLEAFATAALEPGDAA 540
Db 481 LTRRVQLLRPLSRVEILPVPYVTEASRLTVLLPLAAAERDLAPGFLEAFATAALEPGDAA 540

Qy 541 AALTLLLLYEPRQAQRVAHADVFAPVKAHVAELERRFPGARVPWLSVQTAAPSPLRLMDL 600
Db 541 AALTLLLLYEPRQAQRVAHADVFAPVKAHVAELERRFPGARVPWLSVQTAAPSPLRLMDL 600

Qy 601 LSKKHPLDTLFLLAGPDTVLTPDFLNRCRMHAISGWQAFFPMHFQAFHPGVAPPQGPGPP 660
Db 601 LSKKHPLDTLFLLAGPDTVLTPDFLNRCRMHAISGWQAFFPMHFQAFHPGVAPPQGPGPP 660

Qy 661 ELGRDTGRFDRQAASEACFYNSDYVAARGRLAAASEQEEELLESLDVYELFLHFSSLHVL 720
Db 661 ELGRDTGRFDRQAASEACFYNSDYVAARGRLAAASEQEEELLESLDVYELFLHFSSLHVL 720

Qy 721 RAVEPALLQRYRAQTCSARLSEDLYHRCLQSVLEGLGSRTQLAMLLFEQEQGNST 775
Db 721 RAVEPALLQRYRAQTCSARLSEDLYHRCLQSVLEGLGSRTQLAMLLFEQEQGNST 775
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protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 
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4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 5359-7144
or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-1786
and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
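
As an illustration of how the recited amino acid identity levels might be evaluated, the Python sketch below computes percent identity over the columns of a pre-computed pairwise alignment. The denominator convention (all aligned columns, counting gap positions as mismatches), the function names, and the two 20-residue peptides are assumptions made for this example; the specification does not fix a particular calculation here, and other conventions give somewhat different values.

```python
# Identity levels recited in the text, in percent.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, gaps marked with `gap`)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity):
    """Largest recited identity level that `identity` reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical 20-residue reference and a variant with two substitutions.
reference = "MKTAYIAKQRQISFVKSHFS"
variant = "MKTAYIAKQRQLSFVKSHFT"
identity = percent_identity(reference, variant)
print(identity, highest_threshold_met(identity))
```

A variant meeting an identity level in this sense would still need to retain biological activity to fall within the "substantial equivalents" described above.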

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
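
The notion of a degenerate variant can be made concrete with a small translation check. The Python sketch below builds the standard genetic code and tests whether two different nucleotide sequences encode the same polypeptide. The two 12-nucleotide sequences and the function names are hypothetical and chosen only for illustration; they are not sequences from the sequence listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) into one-letter
    amino acid codes; stop codons are shown as '*'."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(seq_a, seq_b):
    """True when the nucleotide sequences differ but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


orf = "ATGCGTGCTTCA"      # hypothetical fragment: Met-Arg-Ala-Ser
variant = "ATGAGAGCCAGC"  # different codons, same four residues
print(translate(orf), translate(variant), is_degenerate_variant(orf, variant))
```

Here the two sequences differ at seven of twelve nucleotide positions yet both translate to the same four residues, which is the property the text attributes to degenerate variants.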
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A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
0 culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example,- the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
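
As a rough illustration of the alanine-scanning approach described above, the following Python sketch enumerates single-residue alanine substitutions of a protein sequence. The function name `alanine_scan` and the example sequence are hypothetical and are not part of the disclosure. Each variant would still have to be tested experimentally for biological activity.

```python
from typing import List, Tuple


def alanine_scan(sequence: str) -> List[Tuple[str, str]]:
    """Return (mutation_label, variant_sequence) for every single alanine substitution.

    Positions that already hold alanine are skipped, because substituting
    alanine with alanine yields no new variant.
    """
    sequence = sequence.upper()
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        # Label uses the usual 1-based convention, e.g. "K3A".
        label = f"{residue}{index + 1}A"
        variant = sequence[:index] + "A" + sequence[index + 1:]
        variants.append((label, variant))
    return variants


if __name__ == "__main__":
    # Hypothetical three-residue peptide, for illustration only.
    for label, variant in alanine_scan("MAK"):
        print(label, variant)
```

Scanning strings of adjacent residues, which the text also mentions, could be sketched the same way by replacing a window of residues instead of a single position.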

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBat™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™ ; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 



4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred identity and/or similarity methods are designed to give the largest match between the
sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software (Nevill-
Manning et al, ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol Biol, 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCB NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
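
To make the notion of percent identity concrete, the following Python sketch computes it for two sequences that have already been aligned by one of the programs named above. It is a minimal illustration under stated assumptions, not the scoring used by GAP, BLAST, or FASTA. The function name `percent_identity`, the use of `-` as the gap character, and the choice to count only gap-free columns in the denominator are assumptions made for this example; the programs listed differ in how they handle gaps and alignment length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment, so they must
    have equal length. Gaps are assumed to be written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, for illustration only.
    print(round(percent_identity("MKT-AYIA", "MKTLAYLA"), 1))
```

A similarity score would additionally count conservative substitutions as matches, which requires a substitution matrix and is therefore left out of this sketch.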

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to

With the complete human genomic sequence being unraveled, the focus will shift to gene identification and to
the functional analysis of gene products. The generation of a set of cDNAs, both sequences and physical clones,
which contains the complete and noninterrupted protein coding regions of all human genes will provide the
indispensable tools for the systematic and comprehensive analysis of protein function to eventually understand
the molecular basis of man. Here we report the sequencing and analysis of 500 novel human cDNAs containing
the complete protein coding frame. Assignment to functional categories was possible for 52% (259) of the
encoded proteins, the remaining fraction having no similarities with known proteins. By aligning the cDNA
sequences with the sequences of the finished chromosomes 21 and 22 we identified a number of genes that
either had been completely missed in the analysis of the genomic sequences or had been wrongly predicted.
Three of these genes appear to be present in several copies. We conclude that full-length cDNA sequencing
continues to be crucial also for the accurate identification of genes. The set of 500 novel cDNAs, and another
1000 full-coding cDNAs of known transcripts we have identified, adds up to cDNA representations covering
2%-5% of all human genes. We thus substantially contribute to the generation of a gene catalog, consisting of
both full-coding cDNA sequences and clones, which should be made freely available and will become an
invaluable tool for detailed functional studies.

[The sequence data described in this paper have been submitted to the EMBL database under the accession nos.
given in Table 2.]



The recent past has witnessed major advances in the 
determination of the sequence of the human genome 
(Dunham et al. 1999; Hattori et al. 2000). Although the 
whole genomic sequence will be completely unraveled 
in the near future (Collins et al. 1998), the identification
of genes and the deciphering of gene structures
will extend for a prolonged time, and cDNA sequences
will continue to be invaluable tools for this adventure,
especially in view of alternative splicing. The primary
focus will shift to the functional analysis of the genes
and their protein products to finally understand the
molecular basis of human life. Current estimates vary
between 29,000 and >70,000 genes to constitute the
protein coding repertoire of the human genome (Fields
et al. 1994; Ewing and Green 2000; Liang et al. 2000;
Roest Crollius et al. 2000). However, thus far only some 
11,000 cDNA sequences have been deposited in public 
databases, which are supposed to contain the complete 
Cataloging Human Genes: The German cDNA Consortium



protein coding open reading frame (ORF). The major- 
ity of the respective cDNA clones are most likely not 
accessible. The generation of a physical clone set rep- 
resenting all human genes that should be made freely 
accessible is consequently regarded to have an ex- 
tremely high impact (Schuler 1997; Pruitt et al. 2000). 
This would permit the establishment of a catalog of 
clones to provide the resources needed in the proteom- 
ics era where the functions of proteins, their action in 
pathways, and the possible disease relation are deci- 
phered. 

Until recently, the long-cDNA sequencing project
carried out at the Kazusa Institute (Nomura et al. 1994;
Nagase et al. 2000) had been the only systematic
full-length cDNA sequencing project with a
significant output of novel sequence information. The
initiation of a new large-scale cDNA sequencing
project coordinated by the National Institutes of Health
has been announced recently (Strausberg et al.
1999). We founded a cDNA Consortium in 1997 as part
of the German Genome Project and aim at the characterization
of the complete sequences of novel human
transcripts at the cDNA level.

Here, we report the sequences and analysis of 500 
novel human cDNAs that all contain the complete pro- 
tein coding region. These cDNAs constitute the most 
valuable essence of 30,000 clones that have been EST 
sequenced and 3630 fully sequenced cDNAs. Over 
1000 cDNAs that cover the complete coding sequence 
of already known transcripts have been identified in 
the EST-sequenced clone set. All clones are made avail- 
able through the Resource Center of the German Ge- 
nome Project (RZPD). 

RESULTS 

Libraries and Clones 

To identify and sequence novel human cDNAs we have 
5'-EST sequenced >30,000 independent cDNA clones. 
Bioinformatic evaluation of these sequences (Fig. 1) led 
to the identification of full-coding clones of already 
known proteins (>1000), and to cDNA clones lacking 
database hits, which are potential targets for full- 
length sequencing. Presumably novel cDNAs were 3'- 
EST sequenced and again analyzed for novelty. Out of 
the initial clones, 3630 cDNAs have been fully se- 
quenced thus far, totaling 8.8 Mb. The sequence subset 
described here comprises 500 novel human cDNAs 
that are representations of the complete protein cod- 
ing part of the original transcripts. Also the other fully 
sequenced cDNAs represent mostly genes that have 
not been fully sequenced elsewhere; however, the 
clones are not likely to contain the complete protein 
coding region of the respective transcripts, or they con- 
tain frame-shift mutations that have probably been in- 
troduced during reverse transcription in the cloning 

Figure 1 Flow of clones, sequences, and information in the
German cDNA Consortium. 5'-EST sequences were systematically
generated from the clones of 384-well microtiter plates and analyzed
for hits in public databases. Clones with novel sequences
were 3'-EST sequenced and these ESTs were analyzed again for 
novelty. Clones of uncharacterized transcripts were reported 
back to the sequencers who then did the full-length sequencing 
of cDNAs. The final sequence was analyzed comprehensively with 
bioinformatic tools and the outputs were evaluated manually. 
The clones feed functional analysis projects that take advantage 
of the clone resources generated. 

process. Therefore, these clones are only of reduced 
value for functional analysis. The number of bases re- 
ported for the 500 full-coding cDNAs is 1,264,620 bp; 
the average insert size of the clones is 2529 bp. The 
clones originate from five different cDNA libraries that 
have been sampled in varying numbers of clones 
(Table 1) to maximize the likelihood of identifying 
novel cDNAs. 

The calculated average size of the encoded pro- 
teins was 470 amino acid residues, which equals the 
number that has been reported previously for some 
1200 genes (Makalowski and Boguski 1998). There was, 
however, a wide variation between 66 and 1805 residues.
The cDNA identifiers, the respective sequence
accession numbers (EMBL/GenBank/DDBJ), cDNA sizes,
the length of ORFs, the chromosomal location, and
functional details for the individual cDNAs are broken
down in Table 2. This table is available in its entirety at
http://www.dkfz-heidelberg.de/abt0840/GCC.

Table 1. Library distribution [table body not recoverable from the source. Columns: RZPD library identifier; tissue; clones reported; average insert size (bp); average ORF size. Library identifiers: DKFZp434, DKFZp564, DKFZp566, DKFZp586, DKFZp761. Tissues legible in the source: testis, fetal brain, fetal kidney, amygdala (brain).]

Genome Research www.genome.org 423

Wiemann et al.


Features of 5'- and 3'-Untranslated Regions
The 5'-untranslated regions (UTRs) averaged 148 nt,
which is the same range as that reported previously 
(Pesole et al. 1996) but considerably shorter than the 
number (215 nt) calculated in the UTRdb (Pesole et al. 
2000). There was a wide variation in size ranging up to 
>800 nt (e.g., DKFZp761F182). Even this long 5'-UTR 
was consistent with the scanning model for transla- 
tional initiation (Kozak 1999) as there was no AUG 
codon in this stretch of sequence. In-frame stop 
codons upstream from the initiator ATG were present 
in 56.4% (282) of the cDNAs. This number is consistent 
with that observed with cDNAs isolated from oligo- 
nucleotide cap ligation libraries (Suzuki et al. 2000), 
where the cDNAs have been selected to contain the 
extreme 5' ends of the respective transcripts. The overall
GC content in the 5'-UTRs (56.3%) was considerably
higher than that in the coding regions (52.6%)
and the 3'-UTRs (45.7%). This is consistent with the 
finding that CpG islands frequently extend into the 
transcribed sequence (Cross and Bird 1995) whereas 
elements present in the 3'-UTR are often AU rich (Xu et 
al. 1997). 
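
The two 5'-UTR features discussed above, GC content and the presence of an in-frame stop codon upstream of the initiator ATG, can be sketched in Python as follows. This is an illustrative sketch only and does not reproduce the authors' analysis pipeline. The function names, the DNA-alphabet input, and the example sequences are assumptions introduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc_count = sum(1 for base in sequence if base in "GC")
    return 100.0 * gc_count / len(sequence)


def has_upstream_inframe_stop(cdna: str, atg_index: int) -> bool:
    """Check for a stop codon upstream of, and in frame with, the initiator ATG.

    `atg_index` is the 0-based position of the A of the initiator ATG.
    The 5'-UTR is walked backwards in steps of three from the ATG.
    """
    cdna = cdna.upper()
    if cdna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    position = atg_index - 3
    while position >= 0:
        if cdna[position:position + 3] in STOP_CODONS:
            return True
        position -= 3
    return False


if __name__ == "__main__":
    # Hypothetical toy sequences: a short 5'-UTR followed by ATG AAA TAA.
    print(has_upstream_inframe_stop("TAGGCCATGAAATAA", 6))
    print(has_upstream_inframe_stop("CCTAGGCATGAAATAA", 7))
    print(gc_content("GGCCAT"))
```

An upstream in-frame stop codon is commonly taken as evidence that the annotated ATG begins the ORF rather than lying inside a longer reading frame, which is why the fraction of cDNAs carrying one is reported.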

The average size of the 3'-UTRs was 926 nt [not 
including the poly(A) tail], which is considerably larger 
than the 388 nt and 820 nt reported by Makalowski 
and Boguski (1998) and Pesole et al. (1996), respec- 
tively. This discrepancy probably derives from the 
longer average size of the cDNAs described here, as 
compared with that observed in the previous studies. 
As with the 5'-UTR there was great variability with the 
size of the 3'-UTR. The translation terminator codon 
TAA could be part of the polyadenylation signal (e.g., 
in clone DKFZp564F2272) whereas in other cDNAs the 
3'-UTR was found to be >4000 nucleotides (e.g.,
DKFZp486C1218).

We screened for the presence of repeat structures 
across the cDNA sequences. The Alu repeat family was 



most frequently contained in the cDNAs; 7.6% (38) of
the cDNA inserts carried this type of repeat. L1 repeats
were present in two cDNAs; one cDNA contained both
LTR2 and Alu repeats (DKFZp761G18121). The repeat
structures were, without exception, located in the 3'-UTR
of the respective cDNAs. However, in a number of
other cDNAs we found repeats also in the presumed
5'-UTRs. All of these clones turned out to be not completely
spliced and/or partial upon further analysis,
and to have intronic sequence at the 5' ends. We therefore
reason that the presence of repeat structures in
5'-UTRs of transcripts is rather rare. The lack of repeat
structures in 5' EST sequences has since been implemented
as a criterion in the selection process of cDNA
clones that are targeted to full-insert sequencing to further
increase the impact of the project.

Functional Classification 

We grouped the cDNAs into functional classes accord- 
ing to homologies of their encoded proteins with al- 
ready known proteins (Table 2 and Fig. 2): cell cycle, 
differentiation and development, membrane protein, 
metabolism, nucleic acid management, protein man- 
agement, signaling and communication, structure and 
motility, transport and traffic, and unknown. Se- 
quence annotations in databases sometimes were mis- 
leading, and the putative function of a protein could 
not be simply deduced by regarding the hit with the 
highest similarity as being the most significant. The 
integration of results from several search algorithms 
was necessary to draw relevant conclusions. For ex- 
ample, the deduced protein sequences were evaluated 
for the presence of specific (protein) sequence patterns 
necessary for the function/activity of a particular protein
[e.g., the DFG/DWG and APE motifs had to be
present in a protein kinase, as reported by Hanks et al. 
(1988)]. The results of this functional classification are 
given in Table 2. The largest class constitutes proteins 
of unknown function (202 cDNAs, 41%). Considering 
that for another 39 cDNAs (8%) the only prediction 
that had been possible was that the deduced proteins 
would contain a putative transmembrane domain, no 
function could be inferred to a total of 241 cDNAs 
(48%) of the predicted proteins. But even if functional 
Figure 2 Functional classification of proteins encoded by the
cDNAs. The deduced proteins were grouped into 10 functional 
categories based on sequence similarity with proteins of known 
function. The fraction of the 500 cDNAs grouped into the respec- 
tive categories is indicated. 



predictions were possible, the identification, for ex- 
ample, of a protein kinase, neither provides informa- 
tion on its substrates nor on the pathway(s) in which it 
is involved. Comprehensive functional analyses 
should be specifically indicated for a set of cDNAs en- 
coding candidates for genes related to disease, such as 
putative GTP binding proteins, ion channels, and a 
cDNA encoding a protein that is highly similar to an 
oncogene. 

We further analyzed the cDNAs for the presence of 
function-related sequence motifs to also identify novel 
members of gene families. We identified 41 potential 
leucine zipper proteins (Struhl 1989), 11 proteins with 
WD-domains (Neer et al. 1994), 11 proteins with pre- 
dicted zinc finger domains (Parraga et al. 1988), 7 po- 
tential protein kinases, and 5 RNA helicases. The re- 
spective clones are indicated in Table 2 (column 9). 
Two cDNAs (DKFZp586I021 and DKFZp43401826) 
contain both a WD-domain and a leucine zipper. A 
zinc-finger domain is predicted additionally for the de- 
duced protein of the former cDNA. 

Alternative Splicing 

We found 39 (7.8%) cDNAs to represent putative splice 
variants of already known transcripts. This number is 
likely to represent the lower end of the fraction of tran- 
scripts that are alternatively spliced in vivo as any 
cDNAs representing already fully-known transcripts 
were excluded from further sequencing and alternative 
splice forms should therefore be under-represented in 
our set. We found ORFs with additional exons (e.g., 
DKFZp761B192), skipped exons (e.g., DKFZp564A032), 
and alternative exons including one containing the 
translational start codon and resulting in a different N
terminus of the deduced peptide (e.g., DKFZp434J154). 



The percentage of alternatively spliced cDNAs appeared
to be slightly higher in fetal brain: 40% of the
alternatively spliced cDNAs originate from fetal brain 
whereas only 28% of all cDNAs analyzed originate 
from this tissue. This finding is consistent with reports 
by Sutcliffe and Milner (1988) and Hanke et al. (1999). 
The presence of intron sequences reminiscent in many 
cDNA sequences available in public databases, how- 
ever, might lead to an overestimation of the extent of 
alternative splicing that is taking place in vivo. Experi- 
mental evidence will therefore be needed to confirm 
presumed alternative splice forms. 

Representation of cDNAs in the UniGene Data Set 
Depending on the true number of human genes, about 
60%-90% have already been identified by partial se- 
quencing of >2,000,000 cDNAs (EST sequencing). 
Overlapping EST sequences have been clustered to 
break down this large number of ESTs to comprehen- 
sive collections that should consist of nonredundant 
data sets having one representation (cluster) for every 
gene. The most widely accepted clustering data set is 
the UniGene (Schuler et al. 1996) resource at the NCBI
(http://www.ncbi.nlm.nih.gov/UniGene/). This 
dataset currently consists of >90,000 clusters of mostly 
partial sequences. Consensus sequences of these clus- 
ters are available from http://www.rzpd.de. To investi- 
gate the representation of the novel cDNAs reported 
here in the UniGene data set and to evaluate the maxi- 
mum number of genes that could be represented there, 
we aligned the full-length sequences with the UniGene 
database. The version of UniGene (Build 105) that was 
used in the analysis consisted of 92,931 clusters with 
10,501 clusters containing known genes. 

In total, 626 UniGene clusters matched with 472 
out of the 500 full-coding cDNA sequences. The ma- 
jority of cDNAs (342, 68%) was represented by one 
UniGene cluster. An additional 130 (26%) cDNAs were 
represented by 284 separate UniGene clusters (Fig. 3). 
Thus, a number of UniGene clusters could be linked by 
the full-length cDNA sequences. An example of three 
UniGene clusters that were joined with one cDNA is 
given in Figure 4. We analyzed the ESTs and clusters 
that were placed internal to the cDNAs reported here 
and found that most of the EST clones making up these 
clusters had originated from internal priming events 
(mostly in reminiscent intron sequences) and not from 
alternative polyadenylation. The number of 640 clusters
that was hit with 472 cDNA sequences implies that
there is ~35% redundancy in UniGene. As the average
size of the human transcripts in general has been esti- 
mated to be in the same range as the average size of the 
cDNAs reported here (by quantification of Northern 
blots that had been hybridized with a labeled oligo- 
nucleotide dT probe; N. Nomura, pers. comm.), our 
finding should be representative. However, the true 

Figure 3 Representation of cDNAs in the UniGene data set

number of genes represented in UniGene will further
condense as a considerable fraction of the UniGene
clusters are singletons (~39%), which are clusters made
up by only one cDNA, and several of these will even- 
tually turn out to be artifacts. Consequently, we esti- 
mate the number of independent genes that are repre- 
sented in UniGene to be 50,000 at most. 
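
The redundancy argument above can be followed with a few lines of arithmetic. The Python sketch below is only an illustration under stated assumptions: the paper does not give an explicit formula, so the definition of redundancy as excess clusters per matched cDNA and the function names are assumptions made here. The input figures (626 or 640 clusters, 472 cDNAs, 92,931 clusters in Build 105) are taken from the text.

```python
def redundancy(clusters_hit: int, cdnas_hit: int) -> float:
    """Excess clusters per matched cDNA, as a fraction.

    A value of 0.35 means 35% more clusters than cDNAs (assumed definition).
    """
    if cdnas_hit <= 0:
        raise ValueError("cdnas_hit must be positive")
    return clusters_hit / cdnas_hit - 1.0


def condensed_gene_estimate(total_clusters: int, redundancy_fraction: float) -> float:
    """Scale a cluster count down by the assumed redundancy."""
    return total_clusters / (1.0 + redundancy_fraction)


if __name__ == "__main__":
    # Cluster and cDNA counts as reported in the text.
    for clusters in (626, 640):
        print(clusters, round(redundancy(clusters, 472), 3))
    # UniGene Build 105 cluster count, condensed by ~35% redundancy.
    print(round(condensed_gene_estimate(92931, 0.35)))
```

Under these assumptions the redundancy correction alone leaves well over 50,000 clusters. The lower bound quoted in the text additionally relies on the expectation that part of the singleton clusters (~39% of UniGene) will prove to be artifacts.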

A fraction of 6% (28 cDNAs) did not have hits in 
the UniGene database (cutoff, sequence identity >95% 
in 50 bp). The low number of the novel cDNAs without 
UniGene matches might in turn imply that >90% of all 
human genes were already represented in this data- 
base. However, we would rather assume that an un- 
known number of genes has escaped cloning and/or 
identification so far as the respective transcripts might 
be expressed only at extremely low levels or in very 
specialized cell types or differentiation stages. A proper 
selection of tissues or even single cell types for cDNA 

Figure 4 Three UniGene clusters are joined when aligned with
the cDNA sequence DKFZp434B0435. The bar on top of the scale
represents the cDNA with the open reading frame drawn as an
open box. The bars below the scale represent the position and
size (in bp) of the three UniGene clusters that are joined by the
cDNA sequence. The accession nos. of representative sequences
of the respective UniGene clusters are given below the bars.



library production will be a critical issue for the detec- 
tion and cloning also of these rarely expressed tran- 
scripts. For example, fetal brain, although very com- 
plex in expression, has been so deeply sampled in EST 
projects [especially the IMAGE 1NIB library (Soares et 
al. 1994)] but also in full-length cDNA sequencing (Na- 
gase et al. 2000) that the novelty rate (3 of 142 cDNAs, 
2%) is rather low in this tissue. In contrast, testis cur- 
rently appears to have a higher potential for identify- 
ing transcripts not yet covered by ESTs (19 of 204 
cDNAs, 9%). 
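
The hit criterion quoted above (sequence identity >95% in 50 bp) can be illustrated with a sliding-window check. The Python sketch below is a simplified stand-in, not the alignment procedure the authors used. It assumes the two sequences are already aligned position by position without gaps, and the function name and parameters are hypothetical.

```python
def has_matching_window(seq_a: str, seq_b: str, window: int = 50,
                        min_identity: float = 0.95) -> bool:
    """Return True if any `window`-long stretch exceeds `min_identity`.

    The sequences are compared position by position (ungapped), which is an
    assumption of this sketch; a real search would use an alignment program.
    """
    length = min(len(seq_a), len(seq_b))
    if length < window:
        return False

    # 1 where the bases agree, 0 where they differ.
    matches = [1 if seq_a[i] == seq_b[i] else 0 for i in range(length)]

    # Running count of matches inside the current window.
    current = sum(matches[:window])
    if current / window > min_identity:
        return True
    for end in range(window, length):
        current += matches[end] - matches[end - window]
        if current / window > min_identity:
            return True
    return False


if __name__ == "__main__":
    reference = "ACGT" * 15            # hypothetical 60-nt sequence
    mutated = list(reference)
    for position in (0, 20, 40):       # three scattered mismatches
        mutated[position] = "N"
    print(has_matching_window(reference, "".join(mutated)))
```

With a 50-bp window and a strict >95% threshold, at least 48 of the 50 positions must agree for a hit to be counted.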

Tissue Specificity of Expression 
To analyze for a possible tissue specificity of expression 
we aligned the cDNA sequences with the EST database 
dbEST. ESTs originating from pooled tissues and tissues 
with unclear origin were excluded. Each cDNA re- 
ceived a score indicating the degree of tissue specific- 
ity. The higher this score, the higher the likelihood 
that expression of the particular transcript should be 
restricted to that tissue. A ubiquitously expressed tran- 
script would have had a score of one. Only cDNAs with 
scores of five or higher are indicated in Table 2 (col- 
umns 10-12). In total, the expression of 22 transcripts 
appeared to be restricted to only one tissue with 
matching tissues of our cDNA and the ESTs (Table 2). 
Six brain-derived cDNAs only matched ESTs that had
derived from brain tissues. Most of the cDNAs encode 
proteins that are either involved in the cell cycle or 
signaling pathways, for example, a stathmin-like pro- 
tein and a protein similar to a calmodulin-binding pro- 
tein. Only one of the six cDNAs encodes a protein of 
unknown function. Another 15 testis cDNAs had hits 
only with ESTs from testis/male genital tract. Although 
predictions could be made for three of the encoded 
proteins (a predicted sperm flagellar protein, a putative 
neurotransmitter transporter, and a possible nuclear 
pore protein), the other 12 cDNAs encode proteins of 
unknown function. The only uterus cDNA predicted to 
be specifically expressed in uterus/ovary encodes a putative
chaperone-associated protease, which could indicate
that this protein might be involved in the differentiation
of the egg or embryo. The expression of
several testis-derived transcripts appeared to be very 
selective as the scores calculated for these cDNAs were 
rather high, compared with scores obtained with other 
cDNAs and tissues (Table 2). This also matches the ob- 
servation that the novelty rate, counting cDNAs with- 
out EST hits, was highest in the testis library (see 
above). 
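
The text does not define how the tissue-specificity score was calculated, only that a ubiquitously expressed transcript would score one and that higher scores indicate more restricted expression. One scoring scheme with those properties is an observed-over-expected ratio, sketched below in Python. This is a hypothetical reconstruction for illustration, not the authors' formula; the function name, the dictionary inputs, and the example counts are all assumptions.

```python
from typing import Dict


def tissue_specificity_scores(est_hits: Dict[str, int],
                              est_totals: Dict[str, int]) -> Dict[str, float]:
    """Observed/expected ratio of EST hits per tissue (assumed scoring scheme).

    est_hits:   number of ESTs matching one cDNA, keyed by tissue.
    est_totals: total number of ESTs in the database, keyed by tissue.
    A transcript whose hits mirror the database composition scores 1.0
    in every tissue.
    """
    total_hits = sum(est_hits.values())
    total_ests = sum(est_totals.values())
    if total_hits == 0 or total_ests == 0:
        return {}

    scores = {}
    for tissue, hits in est_hits.items():
        tissue_total = est_totals.get(tissue, 0)
        if tissue_total == 0:
            continue  # tissue not represented in the database totals
        observed = hits / total_hits
        expected = tissue_total / total_ests
        scores[tissue] = observed / expected
    return scores


if __name__ == "__main__":
    # Hypothetical EST counts, for illustration only.
    totals = {"testis": 100, "brain": 400}
    print(tissue_specificity_scores({"testis": 10}, totals))
    print(tissue_specificity_scores({"testis": 1, "brain": 4}, totals))
```

Under this scheme the maximum attainable score for a tissue is the inverse of its share of the database, so sparsely sampled tissues can reach higher scores than deeply sampled ones.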

cDNAs Mapping to Human Chromosomes 21 and 22 
To demonstrate the power of mapping genes by align- 
ing cDNA with genomic sequences we downloaded the 
sequences of the first two completely sequenced hu- 
man chromosomes 21 (Hattori et al. 2000) and 22 
(Dunham et al. 1999) and aligned them with those
novel cDNAs mapping to the respective chromosomes 
(Table 3). Clone identifiers of the respective cDNAs and 
the insert and ORF sizes are provided in the first three 
columns. For ORF sizes (column 3) the predicted num- 
ber of amino acid residues is given first, followed by the 
number of the residues deduced from the cDNA se- 
quence; a dash (-) is inserted for proteins that were not 
predicted. The predicted localization, based mainly on 
STS data, is given in the fourth column, followed 
by the exact localization of the genes (gene locus 
in bp as defined in the published sequences of chro- 
mosome 21, http://hgp.gsc.riken.go.jp, and chromo- 
some 22, http://www.sanger.ac.uk/cgi-bin/cwa/ 
22cwa.pl). The accession numbers of the genomic 
clone(s) covering the genes, identifiers of predicted 
transcripts (if available; dashes indicate nonpredicted 
genes), the number of predicted exons out of the num- 
ber of identified exons (based on cDNA sequence), and 
the number of UniGene clusters that were hit with the 
respective cDNAs are given in columns 6-9. 

Whereas 13 of the novel cDNAs map to chromo- 
some 22, only two cDNAs map to chromosome 21. 
This could either be a reflection of the generally higher 
gene content of chromosome 22 (554 compared with 
the 225 predicted genes on chromosome 21) or be a 
result of the fact that the percentage of genes that had 
been known previously is higher for chromosome 21 
(this chromosome had long been carefully investigated 
because of its clinical implications, e.g., in Down syn- 
drome). A third explanation could be a correlation be- 
tween chromosomal location and global expression 
levels of the individual genes, as has been proposed by 
Ewing and Green (2000), with genes mapping to chro- 
mosome 21 in general possibly being expressed at 
lower levels compared with genes located on chromo- 
some 22. 

By combining the genomic and cDNA data, the 
exact gene structures of all 15 cDNAs could be deter- 
mined. Although all cDNAs were covered by UniGene 
clusters, only 8 of the 15 genes had been predicted 
from the genomic sequence. Most of these gene pre- 
dictions were precise, identifying the majority or all 
exons. The number of amino acid residues varied in 
most cases only marginally from the number deduced 
from the cDNA sequence. However, one cDNA 
(DKFZp564B212) merged three predicted transcripts to 
only one gene and overlapped another gene 
(bK445C9.C22.3) predicted on the opposite strand. In 
total, seven genes had completely failed to be pre- 
dicted, some of which encode rather large ORFs and 
consist of several exons. 

The mapping information that is based on ge- 
nomic sequence not only gives the exact localization 
of individual genes but also provides information on 
the context of these genes in view of neighboring 
genes (e.g., DKFZp434B194 and DKFZp564B212 are 
only 13 kb apart) and the presence of probable addi- 
tional gene copies. For example, the genes of cDNAs 
DKFZp434N035 and DKFZp434P211 appear to be pres- 
ent on chromosome 22 in 2 and 9 highly similar copies 
(>85% sequence identity on nucleotide level), respec- 
tively. DKFZp434P211 could indicate a cluster of 
highly similar POM121 related genes (Fig. 5), the first 
of which was described by Kawasaki et al. (1997). Two 
copies (2850458 and 2871777) seem to be ancient and 
inactive as they are incomplete, contain several frame 
shifts, and share only 89% and 87% sequence identity 
with the cDNA sequence in exon 1, respectively. The 
other copies are highly similar (>95% identity on 
nucleotide level). Further experiments will be neces- 
sary to investigate how many of the gene copies are 
expressed and to explain the presence of the stop 
codon at position 429 in three of the gene copies (and 
in the cDNA) but a sense codon in this position in four 
other gene copies, possibly leading to an extended pro- 
tein product. EST evidence is available for transcripts of 
both types of genes (e.g., for copies 5055694 and 
8220566). 

DISCUSSION 

The considerable fraction of genes that were not pre- 
dicted in the analysis of the chromosome 21 and 22 
sequences was somewhat surprising, as EST data and 
UniGene clusters (Table 3) were also available for these 
genes. Three of the genes that were not predicted even 
appear to be present in more than one copy on the 
same chromosome, namely, within 6 Mb on chromo- 
some 22. But even if all genes could be identified via 
bioinformatic procedures, the alternative use of exons 
and promoters (alternative splicing) constitutes a prob- 
lem that cannot currently be solved with knowledge of 
the genomic sequence alone. Consequently, only the 
availability of cDNA sequences enables us to define the 
precise protein coding parts of the genome and, in con- 
junction with the genomic counterpart, to also define 
the composition of exons in alternatively spliced tran- 
scripts of the same gene. Both the sequence and the 
chromosomal location of genes are important pieces of 
information supportive also in the process of defining 
and analyzing candidate disease genes. 

Most of the genome has been unraveled as draft 
sequence, where sequence submissions of individ- 
ual genomic clones are released in several contigs 
of varying length. These contigs are usually not 
ordered relative to one another. However, automated 
assembly and annotation tools like GoldenPath 
(http://genome.ucsc.edu/goldenPath/hgTracks.html) 
try to overcome this problem and prove to be ex- 
tremely helpful for the mapping of cDNAs. The avail- 
ability of cDNA sequences in turn immediately helps 
to identify the genes that are located on the respective 
genomic clones, to support the ordering of the draft 
sequence contigs, and to narrow down the regions 
where putative regulatory elements should reside. 
Thus, cDNA and genomic sequences are complemen- 
tary and synergistically add information. The blast 
analysis of cDNAs and matching genomic sequences 
showed that only 32 cDNAs did not have correspond- 
ing genomic matches (not covered, NC in Table 2, col- 
umn 5), which is the number expected because >91% 
of the genomic sequence is reported to be unraveled. 



The chromosomal localization could be approximated 
for 449 cDNAs using the GoldenPath web browser; 21 
BACs had not been mapped (NM). The accession num- 
bers of these BACs are provided in column 5 of Table 2. 
The combination of genomic and cDNA sequence pro- 
vides the gene structures with precise exon-intron 
boundaries and defined intron sequences. 

Furthermore, it will become increasingly impor- 
tant to not only have the human genes identified but 
rather to characterize the precise functions of the encoded 
proteins and also the functions of those transcripts 
that are not translated. To this end, full-coding 
cDNA representations are indispensable tools, for example, 
for the subcloning of exactly defined ORFs into 
expression vectors. However, currently only ~11,000 
nonredundant cDNA sequences have been deposited 
in public databases which are supposed to contain the 
complete protein coding ORF. An even lower number 
of these full-coding ORFs can be obtained as cDNA 
clones through commercial or noncommercial provid- 
ers (e.g., ATCC, Genome Systems, Research Genetics, 
HGMP, Resource Center of the German Genome 
Project) and would thus be available for functional re- 
search. 

Recently, the range of estimates given for the 
number of human genes has evolved to the lower end, 
because in two calculations only ~35,000 human genes 
have been predicted (Ewing and Green 2000; Roest 
Crollius et al. 2000). Our data would also hint at a 
lower than previously expected number, as we would 
estimate the number of genes currently represented in 
UniGene to be 50,000 at most. Still, the real number of 
human genes needs to be established by further cDNA 
and also by comparative genomic sequencing (e.g., of 
the mouse). If it should hold true, however, that the 
number of genes in human was indeed only about twofold 
higher than the ~18,000 genes that have been predicted 
for Caenorhabditis elegans by The C. elegans Sequencing 
Consortium (1998), the question would arise 
as to where the difference in complexity between these 
two life forms originated. Because the sheer doubling 
of gene number would not be likely to account for all 
differences, the comprehensive analysis of gene and 
protein function(s) would become an even greater 
problem. This is because one solution to this apparent 
paradox could be the acquisition of multiple functions 
by many of the proteins expressed in human. This 
would add another order of complexity to the line 
starting with the genome and continuing through the 
transcriptome with alternative splicing, the proteome 
with post-translational modifications, and finally (?) to 
a 'functiome,' which would cover the acquisition of 
diverse functions by the same protein depending on its 
cellular and subcellular environment. Several examples 
of such multiple usages of proteins have already been 
described (Jeffery 1999). 

In the set of 500 novel cDNAs described here, only 
about half of the deduced proteins could be function- 
ally classified, while identification, for example, of a 
protein kinase does not provide information on sub- 
strates or pathways in which this protein is involved. 
Additionally, half of the predicted proteins remain 
without any hint as to their possible function. With 
this in mind, the establishment of a gene catalog 
which will eventually contain a nonredundant set of 
full-coding cDNA sequences and clones covering every 
human gene, is a prerequisite to carry out the experiments 
needed to precisely identify the protein function(s). 
This catalog should be the result of a global 
enterprise integrating the data and clones from as 
many projects and researchers as possible and could be 
an extension of already existing databases such as 
GeneCards (Rebhan et al. 1998) and RefSeq (Pruitt et 
al. 2000) with, for example, links to the clone providers 
mentioned above. In addition to the novel full-coding 
cDNA sequences and clones described here, we have 
identified over 1000 cDNAs which comprise full- 
coding representations of previously known genes. In 
combination, these cDNAs represent 2%-5% of all hu- 
man genes and will thus be a substantial part of the 
catalog and be ideal tools to carry out functional analy- 
ses. Although the 500 novel cDNAs have been fully 
sequenced and can be directly used in functional 
analysis, the cDNAs representing known genes need 
further characterization because these are not fully 
sequenced. To this end, we amplify the ORFs from 
these cDNAs and verify the predicted size. These 
ORFs are then cloned into a bacterial expression 
vector which contains an N-terminal fusion with the 
GFP. As the Gateway system (Life Technologies) is 
employed in the cloning process, the ORFs can be 
shuttled into any expression vector (Simpson et al. 
2000). Only intact reading frames (no PCR frame shifts, 
no introns, no frame shifts in the clone) lead to fluo- 
rescent colonies as the ORF extends uninterrupted into 
the GFP. The Gateway entry clones of the verified 
genes are also made available through the Resource 
Center. 

To address the systematic functional analysis of 
the novel proteins, a large-scale project dealing with 
the subcellular localization and functional analysis of 
the proteins encoded by newly identified cDNAs re- 
ported here is underway (Simpson et al. 2000). Thus, 
the gene catalog in upcoming years will form the basis 
for the large-scale and comprehensive functional 
analysis of human genes and proteins, which is crucial 
to understand the basis of human life, disease, and 
death. 

METHODS 
Library Construction 
SMART Libraries 

The DKFZp564 (human fetal brain) and DKFZp566 (human 
fetal kidney) libraries were generated using the SMART kit 
(Clontech). PCR amplification of the cDNA was necessary to 
obtain enough cDNA for cloning. The first-strand primer 
contained the KS sequence of the pBluescript vector (Stratagene) 
and any base but T (IUB code = V) in the 3'-terminal position 
of the primer [TCGAGGTCGACGGTATCGATAAG(T)19V]. 
Amplification of the primary cDNA with Amplitaq (Perkin 
Elmer) and Pfu (Stratagene) DNA polymerases in a ratio of 
19/1 (vol/vol) was carried out with primers that contained 
uracil residues (3' primer: 
CAUCAUCAUCAUCGAGGTCGACGGTATCGATAAG; 5' primer: 
CUACUACUACUATACGCTGCGAGAAGACGACAGAA) and that were compatible with 
the pAMP1 (Life Technologies) cloning sites for directional 
cloning. Prior to cloning, the cDNA was size fractionated on 
an agarose gel. Fragments >2 kb were excised and extracted 
from the gel using GELase (Epicentre). Cloning was done using 
uracil DNA glycosylase (UDG, Life Technologies) and chemically 
competent bacterial cells (XL-2 Blue, Stratagene). 

Conventional Libraries 

The DKFZp434 (human adult testis), DKFZp586 (human adult 
uterus), and DKFZp761 (human adult amygdala) libraries 
were generated using conventional approaches (Gubler and 
Hoffman 1983), employing a NotI-dT V primer for first-strand 
synthesis [GAGCGGCCGC(T)19V]. After second-strand synthesis, 
SalI adapters were ligated to the blunted cDNA. Then 
the cDNA was cut with NotI to generate SalI-NotI-compatible 
ends at the 5' and 3' ends of the cDNA, respectively, to allow 
directional cloning. The cDNAs were then size-selected on 
agarose gels in two dimensions and cloned into pSPORT1 precut 
with SalI and NotI (Life Technologies). 

Availability of cDNA Libraries and Clones 

All libraries have been arrayed into 384-well microtiter plates 
and spotted on high-density nylon membranes. Each library 
consists of 27,000 clones or multiples thereof. High-density 
clone filters and individual clones are available through the 
Resource Center of the German Genome Project 
(http://www.RZPD.de; clone@rzpd.de). 

Selection of Clones for Sequencing 

First, 5' ESTs were systematically generated from all clones of 
384-well microtiter plates. The sequences were analyzed with 
blastn (Altschul et al. 1990) and blastx (Gish and States 
1993) against EMBL, PIR, SWISSPROT, and TREMBL databases 
for the lack of identical (>95% identity over 50 bp) matches 
with known cDNAs, and for the presence of ORFs. 

Clones with novel sequences were 3' end sequenced. 
These 3' ESTs were checked for the lack of matches with 
known genes in public databases, for repeat structures, and for 
the presence of polyadenylation signals. Clones matching the 
selection criteria were subjected to full-length sequencing. 

Sequencing Methodology and Strategy 

Sequencing was done preferentially using dye terminator 
chemistry (Applied Biosystems or Amersham) on ABI 377 au- 
tomated DNA sequencers; one partner used EMBL prototype 
instruments (Wiemann et al. 1995) mainly with dye primer 
chemistry. Primer walking (Strauss et al. 1986) was the pre- 
ferred sequencing strategy for the full-length sequencing of 
cDNAs. Design of walking primers was done preferentially 
using software (e.g., Schwager et al. 1995; Haas et al. 1998) 
that permitted the complete automation of this usually 
time-consuming process and thus helped in the parallel processing 
of large numbers of clones. 

Bioinformatic Analysis 

Every complete cDNA sequence was compared with the se- 
quences in EMBL, EMBL-EST, EMBL-STS using BLASTN 
(Altschul et al. 1990). Searches against EMBL were done to 
determine whether the cDNAs were already known and to 
identify any genomic sequence information available that 
would cover the respective genes. Searches against EMBL-EST 
were performed to analyze for the abundance of transcripts, 
to obtain information on a possible tissue specificity of ex- 
pression, and to identify putative alternative splice forms or 
alternative use of polyadenylation signals. The annotations 
on the source tissue of the respective EST clones were parsed 
from the database entries to calculate the real ratio versus the 
expected ratio of expression according to the equation: (# hits 
tissue/total # hits)/(# ESTs tissue/total # ESTs). A gene that was 
transcribed at a constant level in many tissues would have a 
ratio of one. Significantly higher or lower ratios would indicate 
increased or decreased levels of transcription in the tissue, 
respectively. To identify tissue-specific expression, the param- 
eters were set to >4 ESTs matching the respective cDNA that 
needed to have been sequenced from a given tissue, and the 
cutoff for the ratio of overexpression was set to five. ESTs 
originating from pooled tissues or that were of unspecified 
origin were disregarded in this analysis. To obtain chromo- 
somal mapping information, the sequences were aligned with 
the EMBL-STS database. 
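
The ratio described above can be illustrated with a short sketch. This is 
not the Consortium's software; the function and parameter names are 
hypothetical, and the sketch assumes that ">4 ESTs" means at least five 
matching ESTs from the tissue and that a ratio equal to the cutoff of five 
counts as tissue specific. 

```python
def expression_ratio(hits_tissue, hits_total, ests_tissue, ests_total):
    """Real versus expected ratio of expression for one tissue.

    Implements (# hits tissue / total # hits) / (# ESTs tissue / total # ESTs).
    The products are formed first so that only one division is performed.
    """
    if hits_total <= 0 or ests_tissue <= 0 or ests_total <= 0:
        raise ValueError("totals and tissue EST count must be positive")
    return (hits_tissue * ests_total) / (hits_total * ests_tissue)


def is_tissue_specific(hits_tissue, hits_total, ests_tissue, ests_total,
                       min_hits=5, ratio_cutoff=5.0):
    """Apply the two criteria from the text (assumed inclusive thresholds)."""
    if hits_tissue < min_hits:
        # Fewer than five matching ESTs from this tissue: not evaluated as specific.
        return False
    ratio = expression_ratio(hits_tissue, hits_total, ests_tissue, ests_total)
    return ratio >= ratio_cutoff
```

For example, a cDNA with 10 of its 20 EST hits from a tissue that contributes 
1,000 of 10,000 ESTs in the database would have a ratio of 
(10/20)/(1,000/10,000) = 5. 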

The potential protein-sequences were identified by a 
search for the longest ORF in each of the three forward frames 
with a minimum length of 90 codons. The deduced protein 
sequences were searched against the nonredundant protein 
data set of PIR, SWISSPROT, and TREMBL [blastp, using the 
SEG-filter by Wootton (1994)]. Any cDNAs without ORF >90 
codons were analyzed with blastn against TREMBL to iden- 
tify even shorter ORFs present. 
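
The ORF search can be sketched as follows. The text does not state how an 
ORF was delimited, so this sketch assumes that an ORF starts at an ATG and 
ends at the first in-frame stop codon, and that ORFs running off the end of 
the sequence are ignored; the function name and return format are 
hypothetical. 

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orfs(seq, min_codons=90):
    """Return the longest ORF in each of the three forward frames.

    The result is a list with one entry per frame: None if no ORF of at
    least min_codons was found, otherwise (start, end, n_codons), where
    start is the index of the ATG, end is the index just past the stop
    codon, and n_codons counts the ATG but not the stop codon.
    """
    seq = seq.upper()
    results = []
    for frame in range(3):
        best = None
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None:
                if codon == "ATG":
                    start = pos  # open a candidate ORF
            elif codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons and (best is None or n_codons > best[2]):
                    best = (start, pos + 3, n_codons)
                start = None  # look for the next ATG in this frame
        results.append(best)
    return results
```

Frames in which nothing reaches the 90-codon minimum return None, which 
corresponds to the cDNAs that were passed on to the search for shorter ORFs. 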

blastx searches were performed against a nonredun- 
dant protein database comprising PIR, SWISSPROT, and 
TREMBL. The SEG-filter was used to screen for potential frame 
shifts in the coding sequences of the cDNAs and to identify 
cDNAs that were not fully spliced or were alternatively 
spliced. The protein sequence was then transferred to pedant 
(Frishman and Mewes 1997). pedant performed automated 
database searches: psiBLAST (Altschul et al. 1997), an iterated 
profile search procedure; hmmer (Sonnhammer et al. 1997), a 
Hidden Markov model software which uses statistical descrip- 
tions of a sequence family's consensus; and blimps (Wallace 
and Henikoff 1992) for similarity searches against the 
BLOCKS (Henikoff et al. 2000) database. PROSITE protein se- 
quence patterns were identified by ProSearch (Kolakowski et 
al. 1992). clustal-w (Thompson et al. 1994) was used for 
multiple sequence alignments of DNA and proteins. Trans- 
membrane regions were identified by ALOM2 (Klein et al. 
1984), and signal peptides in secreted proteins by signalp 
(Nielsen et al. 1997). seg (Wootton and Federhen 1993) has 
been employed to detect low-complexity regions in protein 
sequences and coils (Lupas et al. 1991) for the detection of 
coiled coils. For the functional classification of the cDNAs, 
sequence identities with E-values <10E-30 (blastn) and 
<10E-10 (blastx) were accepted to be significant. The comprehensive 
bioinformatic data on all cDNAs analyzed by the 
Consortium are accessible at 
http://www2.mips.biochem.mpg.de/proj/cDNA/index.html. Mapping of the cDNAs to 
chromosomes was done first by blast analysis of the cDNA 
sequences against the human genomic sequence (NCBI-htgs 
database), followed by identifying the mapping position with 
help of the GoldenPath (Jim Kent, UCSC) browser 
(http://genome.ucsc.edu/goldenPath/hgTracks.html). 

Availability of Clones and Further Information 

All clones described here, and the other clones analyzed by 
the German cDNA Consortium, are available from the Resource 
Center of the German Genome Project (http://www.rzpd.de; 
clone@rzpd.de). The comprehensive bioinformatic 
data on all cDNAs analyzed by the Consortium are accessible 
at http://www2.mips.biochem.mpg.de/proj/cDNA/index.html. 
Additional information about the analysis of the 
described set of cDNAs is available at 
http://www.dkfz-heidelberg.de/abt0840/GCC. The full version of Table 2 can 
be obtained at this location in Excel, tab-delimited text, and 
pdf formats. 
Numeric aberrations in chromosomes, referred to as aneuploidy, 
are commonly observed in human cancer. Whether aneuploidy 
is a cause or consequence of cancer has long been 
debated. Three lines of evidence now make a compelling case 
for aneuploidy being a discrete chromosome mutation event 
that contributes to malignant transformation and progression 
process. First, precise assay of chromosome aneuploidy in 
several primary tumors with in situ hybridization and compara- 
tive genomic hybridization techniques have revealed that 
specific chromosome aneusomies correlate with distinct tumor 
phenotypes. Second, aneuploid tumor cell lines and in vitro 
transformed rodent cells have been reported to display an 
elevated rate of chromosome instability, thereby indicating that 
aneuploidy is a dynamic chromosome mutation event associ- 
ated with transformation of cells. Third, and most important, a 
number of mitotic genes regulating chromosome segregation 
have been found mutated in human cancer cells, implicating 
such mutations in induction of aneuploidy in tumors. Some of 
these gene mutations, possibly allowing unequal segregations 
of chromosomes, also cause tumorigenic transformation of 
cells in vitro. In this review, the recent publications investigating 
aneuploidy in human cancers, rate of chromosome instability 
in aneuploid tumor cells, and genes implicated in regulating 
chromosome segregation found mutated in cancer cells 
are discussed. Curr Opin Oncol 2000, 12:82-88 © 2000 Lippincott Williams 
& Wilkins, Inc. 
Cancer research over the past decade has firmly established 
that malignant cells accumulate a large number of 
genetic mutations that affect differentiation, proliferation, 
and cell death processes. In addition, it is also 
recognized that most cancers are clonal, although they 
display extensive heterogeneity with respect to karyotypes 
and phenotypes of individual clonal populations. 
It is estimated that numeric chromosomal imbalance, 
referred to as aneuploidy, is the most prevalent genetic 
change recorded among over 20,000 solid tumors 
analyzed thus far [1]. Phenotypic diversity of the clonal 
populations in individual tumors involves differences in 
morphology, proliferative properties, antigen expression, 
drug sensitivity, and metastatic potentials. It has been 
proposed that an underlying acquired genetic instability 
is responsible for the multiple mutations detected in 
cancer cells that lead to tumor heterogeneity and 
progression [2]. In a somewhat contradictory argument, 
it has also been suggested that clonal expansion due to 
selection of cells undergoing normal rates of mutation 
can explain malignant transformation and progression 
process in humans [3]. Acquired genetic instability, 
nonetheless, is considered important for more rapid 
progression of the disease [4••]. Although the original 
hypothesis on genetic instability in cancer primarily 
focused on chromosome imbalances in the form of aneuploidy 
in tumor cells, the actual relevance of such mutations 
in cancer remains a controversial issue. 

Whether or not aneuploidy contributes to the malignant 
transformation and progression process has long been 
debated. A prevalent idea on genetics of cancer referred 
to as "somatic gene mutation hypothesis" contends that 
gene mutations at the nucleotide level alone can cause 
cancer by either activating cellular proto-oncogenes to 
dominant cancer causing oncogenes and/or by inactivat- 
ing growth inhibitory tumor suppressor genes. In this 
scheme of things chromosomal instability in the form of 
aneuploidy is a mere consequence rather than a cause of 
malignant transformation and progression process. 

In this review, some of the recent observations on the 
subject are discussed and compelling evidence is 
provided to suggest that aneuploidy is a distinct form of 
genetic instability in cancer that frequently correlates 
with specific phenotypes and stages of the disease. 
Furthermore, discrete genetic targets affecting chromosomal 
stability in cancer cells, recently identified, are 
also discussed. These data provide a new direction 
toward elucidating the molecular mechanisms responsible 
for induction of aneuploidy in cancer and may eventually 
be exploited as novel therapeutic targets in the 
future. 

Genetic alterations in cancer 

Alterations in many genetic loci regulating growth, 
senescence, and apoptosis, identified in tumor cells, 
have led to the current understanding of cancer as a 
genetic disease. The genetic changes identified in 
tumors include: subtle mutations in genes at the 
nucleotide level; chromosomal translocations leading to 
structural rearrangements in genes; and numeric 
changes in either partial segments of chromosomes or 
whole chromosomes (aneuploidy) causing imbalance in 
gene dosage. 

For the purpose of this review, both segmental and whole 
chromosome imbalances leading to altered DNA dosage 
in cancer cells are included as examples of aneuploidy. 

Incidence of aneuploidy in cancer 

Evidence of aneuploidy involving one or more chromosomes 
has been commonly reported in human tumors. 
Although these observations were initially made using 
classic cytogenetic techniques late in a tumor's evolu- 
tion and were difficult to correlate with cancer progres- 
sion, more recent studies have reported association of 
specific nonrandom chromosome aneuploidy with 
different biologic properties such as loss of hormone 
dependence and metastatic potential [5]. 

Classic cytogenetic studies performed on tumor cells 
had serious limitations in scope because they were 
applicable only to those cases in which mitotic chromo- 
somes could be obtained. Because of low spontaneous 
rates of cell division in primary tumors, analyses 
depended on cells either derived selectively from 
advanced metastases or those grown in vitro for variable 
periods of time. In both instances, metaphases analyzed 
represented only a subset of primary tumor cell popula- 
tion. Two major advances in cytogenetic analytic tech- 
niques, in situ hybridization (ISH) and comparative 
genomic hybridization (CGH), have allowed better reso- 
lution of chromosomal aberrations in freshly isolated 
tumor cells [6]. ISH analyses with chromosome-specific 
DNA probes, a powerful adjunct to metaphasic analysis, 
allows assessment of chromosomal anomalies within 
tumor cell populations in the contexts of whole nuclear 
architecture and tissue organization. CGH allows 
genome wide screening of chromosomal anomalies 
without the use of specific probes even in the absence 
of prior knowledge of chromosomes involved. Although 
both techniques have certain limitations in terms of 
their resolution power, they nonetheless provide a 
better approximation of chromosomal changes occurring 
among tumors of various histology, grade, and stage 
compared with what was possible with the classic cytogenetic 
techniques. Genomic ploidy measurements 
have also been performed at the DNA level with flow 
cytometry and cytofluorometric methods. Although 
these assays underestimate chromosome ploidy due to a 
chromosomal gain occasionally masking a chromosomal 
loss in the same cell, several studies using these 
methods have supported the conclusion that DNA 
aneuploidy closely associates with poor prognosis in 
various cancers [7,8]. This discussion of some recent 
examples published on aneuploidy in cancer includes 
discussion of studies dealing with DNA ploidy measurements 
as well. Most of these observations are correlative 
without direct proof of specific involvement of genes on 
the respective chromosomes. Identification of putative 
oncogenes and tumor suppressor genes on gained and 
lost chromosomes in aneuploid tumors, however, is 
providing strong evidence that chromosomes involved in 
aneuploidy play a critical role in the tumorigenic 
process. 

In renal tumors, either segmental or whole chromosome 
aneuploidy appears to be uniquely associated with 
specific histologic subtypes [9]. Tumors from patients 
with hereditary papillary renal carcinomas (HPRC) 
commonly show trisomy of chromosome 7, when 
analyzed by CGH. Germline mutations of a putative 
oncogene MET have been detected in patients with 
HPRC. A recent study [10] has demonstrated that an 
extra copy of chromosome 7 results in nonrandom dupli- 
cation of the mutant MET allele in HPRC, thereby 
implicating this trisomy in tumorigenesis. The study 
suggested that mutation of MET may render the cells 
more susceptible to errors in chromosome replication, 
and that clonal expansion of cells harboring duplicated 
chromosome 7 reflects their proliferative advantage. In 
addition to chromosome 7, trisomy of chromosome 17 in 
papillary tumors and also of chromosome R in mesoblastic 
nephroma are commonly seen. Association of specific 
chromosome imbalances with benign and malignant 
forms of papillary renal tumors, therefore, not only 
contribute to an understanding of tumor origins and 
evolution, but also implicate aneuploidy of the respec- 
tive chromosomes in the tumorigenic transformation 
process. 

In colorectal tumors, chromosome aneuploidy is a 
common occurrence. In fact, molecular allelotyping 
studies have suggested that limited karyotyping data 
available from these tumors actually underestimate the 
true extent of these changes. Losses of heterozygosity 
reflecting loss of the maternal or paternal allele in 
tumors are widespread and often accompanied by a gain 
of the opposite allele. Therefore, for example, a tumor 
could lose a maternal chromosome while duplicating 
the same paternal chromosome, leaving the tumor cell 
with a normal karyotype and ploidy but an aberrant 
allelotype. It has been estimated that cancer of the 
colon, breast, pancreas, or prostate may lose an average 
of 25% of its alleles. It is not unusual to discover that a 
tumor has lost over half of its alleles [4]. In clinical 
settings, DNA ploidy measurements have revealed that 
DNA aneuploidy indicates high risk of developing 
severe premalignant changes in patients with ulcerative 
colitis, who are known to have an increased risk of 
developing colorectal cancer [11]. DNA aneuploidy has 
been found to be one of the useful indicators of lymph 
node metastasis in patients with gastric carcinoma and 
associated with poor outcome compared with diploid 
cases [12,13]. CGH analysis of chromosome aneuploidy, 
on the other hand, was reported to correlate gain 
of chromosome 20q with high tumor S phase fractions 
and loss of 4q with low tumor apoptotic indices [14]. 
Aneuploidy of chromosome 4 in metastatic colorectal 
cancer has recently been confirmed in studies that used 
unbiased DNA fingerprinting with arbitrarily primed 
polymerase chain reactions to detect moderate gains 
and losses of specific chromosomal DNA sequences 
[15]. The molecular karyotype (amplotype) generated 
from colorectal cancer revealed that moderate gains of 
sequences from chromosomes 8 and 13 occurred in 
most tumors, suggesting that overrepresentation of 
these chromosomal regions is a critical step for metastatic 
colorectal cancer. 

In addition to being implicated in tumorigenesis and 
correlated with distinct tumor phenotypes, chromosome 
aneuploidy has been used as a marker of risk assessment 
and prognosis in several other cancers. The potential 
value of aneuploidy as a noninvasive tool to identify 
individuals at high risk of developing head and neck 
cancer appears especially promising. Interphase fluores- 
cence in situ hybridization (FISH) revealed extensive 
aneuploidy in tumors from patients with head and neck 
squamous cell carcinomas (HNSCC) and also in clini- 
cally normal distant oral regions from the same individu- 
als [16,17]. It has been proposed that a panel of chromo- 
some probes in FISH analyses may serve as an 
important tool to detect subclinical tumorigenesis and 
for diagnosis of residual disease. The presence of aneu- 
ploid or tetraploid populations is seen in 90% to 95% of 
esophageal adenocarcinomas, and when seen in 
conjunction with Barrett's esophagus, a premalignant 
condition, predicts progression of disease [18,19]. 
Chromosome ploidy analyses in conjunction with loss of 
heterozygosity and gene mutation studies in Barrett's 
esophagus reflect evolution of neoplastic cell lineages in 
vivo [20]. Evolution of neoplastic progeny from Barrett's 
esophagus following somatic genetic mutations 
frequently involves bifurcations and loss of heterozygos- 
ity at several chromosomal loci leading to aneuploidy 
and cancer. Accordingly, it is hypothesized that during 



tumor cell evolution diploid cell progenitors with 
somatic genetic abnormalities undergo expansion with 
acquired genetic instability. Such instability, often 
manifested in the form of increased incidence of aneu- 
ploidy, enters a phase of clonal evolution beginning in 
premalignant cells that proceeds over a period of time 
and occasionally leads to malignant transformation. The 
clonal evolution continues even after the emergence of 
cancer. 

The significance of DNA and chromosome aneu- 
ploidy in other human cancers continues to be evalu- 
ated. Among papillary thyroid carcinomas, aneuploid 
DNA content in tumor cells was reported to correlate 
with distant metastases, reflecting worsened progno- 
sis [21]. Genome-wide screening of follicular thyroid 
tumors by CGH, on the other hand, revealed frequent 
loss of chromosome 22 in widely invasive follicular 
carcinomas [22]. Chromosome copy number gains in 
invasive neoplasm compared with foci of ductal carci- 
noma in situ (DCIS) with similar histology have been 
proposed to indicate involvement of aneuploidy in 
progression of human breast cancer [23]. ISH analyses 
of cervical intraepithelial neoplasia have provided 
suggestive evidence that aneusomy of chromosomes 1, 7 
and X is associated with progression toward cervi- 
cal carcinoma [24]. 

Although the prognostic value of numeric aberrations 
remains a matter of debate in human hematopoietic 
neoplasia, there have been recent studies to suggest that 
the presence of monosomy 7 defines a distinct subgroup 
of acute myeloid leukemia patients [25]. It is interesting 
in this context that therapy-related myelodysplasia 
syndromes have been reported to display monosomy 5 
and 7 karyotypes, reflecting poor prognosis [26]. 

The clinical observations mentioned previously are 
supported by in vitro studies in human and rodent cells in 
which aneuploidy is induced at early stages of transforma- 
tion [27,28]. It is even suggested that aneuploidy may, 
in some instances, cause cell immortalization, which is a 
critical step preceding transformation. 

Finally, in an interesting study to develop transgenic 
mouse models of human chromosomal diseases, chromo- 
some segment-specific duplications and deletions of the 
genome were reported to be constructed in mouse 
embryonic stem cells [29]. Three duplications for a 
portion of mouse chromosome 11 syntenic with human 
chromosome 17 were established in the mouse 
germline. Mice with a 1 Mb duplication developed corneal 
hyperplasia and thymic tumors. The findings represent 
the first transgenic mouse model of aneuploidy of a 
defined chromosome segment that documents the direct 
role of chromosome aneusomy in tumorigenesis. 

Aneuploidy as "dynamic cancer-causing 
mutation" instead of a "consequential state" 
in cancer 

According to the hypothesis previously discussed, aneu- 
ploidy represents either a "gain of function" or "loss of 
function" mutation at the chromosome level with a 
causative influence on the tumorigenesis process. The 
hypothesis, however, is based only on circumstantial 
evidence even though existence of aneuploidy is corre- 
lated with different tumor phenotypes. The existence of 
numeric chromosomal alterations in a tumor does not 
mean that the change arose as a dynamic mutation due 
to genomic instability, because several factors could also 
lead to consequential aneuploidy in tumors. Although 
aneuploidy as a dynamic mutation due to genomic insta- 
bility in tumor cells would occur at a certain measurable 
rate per cell generation, a consequential state of aneu- 
ploidy in tumors may not occur at a predictable rate 
under similar conditions or in tumors with similar 
phenotypes. In addition to genomic instability, differ- 
ences in environmental factors with selective pressure 
could explain high incidence of aneuploidy and other 
somatic mutations in tumors compared with normal cells 
[4]. These include humoral, cell substratum, and cell- 
cell interaction differences between tumor and normal 
cell environments. It could be argued that despite 
similar rates of spontaneous aneuploidy induction in 
normal and tumor cells, the latter are selected to prolif- 
erate due to altered selective pressure in the tumor cell 
environment, whereas the normal cells are eliminated 
through activation of apoptosis. Alternatively, of course, 
one could postulate that selective expression or overex- 
pression of anti-apoptotic proteins or inactivation of 
proapoptotic proteins in tumor cells may counteract 
default induction of apoptosis in G2/M phase cells 
undergoing missegregation of chromosomes. Recent 
demonstration of overexpression of a G2/M phase anti- 
apoptotic protein survivin in cancer cells [30] suggests 
that this protein may favor aberrant progression of aneu- 
ploid transformed cells through mitosis. This would 
then lead to proliferation of aneuploid cell lineages, 
which may undergo clonal evolution. 

To ascertain that aneuploidy is a dynamic mutational 
event, various human tumor cell lines and transformed 
rodent cell lines have been analyzed for the rate of 
aneuploidy induction. Growth under controlled in 
vitro conditions ensures that environmental factors 
do not influence selective proliferation of 
cells with chromosome instability. In one study, 
Lengauer et al. [31•] provided unequivocal evidence by 
FISH analyses that losses or gains of multiple chromo- 
somes occurred in excess of 10^-2 per chromosome per 
generation in aneuploid colorectal cancer cell lines. The 
study further concluded that such chromosomal instabil- 
ity appeared to be a dominant trait. Using another in 



vitro model system of Chinese hamster embryo (CHE) 
cells, Duesberg et al. [32•] have also obtained similar 
results. With clonal cultures of CHE cells, transformed 
with nongenotoxic chemicals and a mitotic inhibitor, 
these authors demonstrated that the overwhelming 
majority of the transformed colonies contained more 
than 50% aneuploid cells, indicating that aneuploidy 
would have originated from the same cells that under- 
went transformation. All the transformed colonies tested 
were tumorigenic. It was further documented that the 
ploidy factor, representing the quotient of the modal 
chromosome number divided by the normal diploid 
number in each clone, correlated directly with the 
degree of chromosomal instability. Therefore, chromo- 
somal instability was found proportional to the degree of 
aneuploidy in the transformed cells and the authors 
hypothesized that aneuploidy is a unique mechanism of 
simultaneously altering and destabilizing, in a massive 
manner, the normal cellular phenotypes. In the absence 
of any evidence that the transforming chemicals used in 
the study did not induce other somatic mutations, it is 
difficult to rule out the contribution of such mutations 
in the transformation process. These results nonetheless 
make a strong case for aneuploidy being a dynamic chro- 
mosome mutation event intimately associated with 
cancer. 
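
The ploidy factor described above is a simple ratio, and a short sketch may make the arithmetic concrete. The function name `ploidy_factor`, the use of the most frequent per-cell chromosome count as the modal number, and the example counts are illustrative assumptions, not the procedure used in the cited study. The diploid number is passed in as a parameter; 22 is the commonly cited diploid chromosome number for Chinese hamster cells.

```python
from collections import Counter


def ploidy_factor(chromosome_counts, diploid_number):
    """Modal chromosome number of a clone divided by the normal diploid number."""
    if not chromosome_counts:
        raise ValueError("need at least one per-cell chromosome count")
    if diploid_number <= 0:
        raise ValueError("diploid number must be positive")
    # Modal number: the chromosome count observed in the largest number of cells.
    modal_number, _ = Counter(chromosome_counts).most_common(1)[0]
    return modal_number / diploid_number


# Hypothetical per-cell chromosome counts for one clone (illustrative values only).
counts = [33, 33, 32, 33, 34, 33, 22]
factor = ploidy_factor(counts, diploid_number=22)
print(f"ploidy factor = {factor:.2f}")
```

A factor of 1.0 corresponds to a clone whose modal count equals the diploid number; values further from 1.0 indicate a greater degree of aneuploidy in the sense used in the passage.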

Aneuploidy versus somatic gene mutation in 
cancer 

The idea that numeric chromosome imbalance or aneu- 
ploidy is a direct cause of cancer was proposed at the 
turn of the century by Theodor Boveri [33]. However, 
the hypothesis was largely ignored over the last several 
decades in favor of the somatic gene mutation hypothe- 
sis, mentioned earlier. Evidence accumulating in the 
literature lately on specific chromosome aneusomies 
recognized in primary tumors, incidence of aneuploidy 
in cells undergoing transformation, and aneuploid tumor 
cells showing a high rate of chromosome instability has 
led to the rejuvenation of Boveri's hypothesis. The 
concept has recently been discussed as a "vintage wine 
in a new bottle" [34•]. The author points out that 
except for rare cancers caused by dominant retroviral 
oncogenes, diploidy does not seem to occur in solid 
tumors, whereas aneuploidy is the rule rather than the 
exception in cancer. 

Aneuploidy as an effective mutagenic mechanism 
driving tumor progression, on the other hand, is being 
recognized as a viable solution to the paradox that with 
the known mutation rate in non-germline cells (~10^-7 per 
gene per cell generation) tumor cell lineages cannot 
accumulate enough mutant genes during a human life- 
time [35]. The concept is gaining significant credibility 
since genes that potentially affect chromosome segrega- 
tion were found mutated in human cancer. Some of 
these genes have also been shown to have transforming 
capability in in vitro assays. Selected recent publications 
describing the findings are discussed below in 
reference to the mitotic targets potentially involved in 
inducing chromosome segregation anomalies in cells. 

Potential mitotic targets and molecular 
mechanisms of aneuploidy 

Because aneuploidy represents numeric imbalance in 
chromosomes, it is reasonable to expect that aneuploidy 
arises due to missegregation of chromosomes during cell 
division. There are many potential mitotic targets, 
which could cause unequal segregation of chromosomes 
(Fig. 1). Recent investigations have identified several 
genes involved in regulating these mitotic targets and 
mitotic checkpoint functions, which can be implicated 
in induction of aneuploidy in tumor cells. This discus- 
sion is restricted to those mitotic targets and checkpoint 
genes whose abnormal functioning has been observed in 
cancer or has been shown to cause tumorigenic transfor- 
mation of cells, in recent years. The role of telomeres is 
discussed elsewhere in this issue. For a more detailed 
description of the components of mitotic machinery and 
their possible involvement in causing chromosome 
segregation abnormalities in tumor cells, readers may 
refer to a recently published review [36•]. 

Among the mitotic targets implicated in cancer, centro- 
some defects have been observed in a wide variety of 
malignant human tumors. Centrosomes play a central role 
in organizing the microtubule network in interphase cells 
and mitotic spindle during cell division. Multipolar 
mitotic spindles have been observed in human cancers in 
situ and abnormalities in the form of supernumerary 



Figure 1. Potential mitotic targets causing aneuploidy in 
oncogenesis 

Diagram illustrates that defects in several processes involving chromosomal, 
spindle microtubule, and centrosomal targets, in addition to abnormal cytokine- 
sis, may cause unequal partitioning of chromosomes during mitosis, leading to 
aneuploidy. Recently obtained evidence in favor of some of these possibilities is 
discussed in the text. 



centrosomes, centrosomes of aberrant size and shape as 
well as aberrant phosphorylation of centrosome proteins 
have been reported in prostate, colon, brain, and breast 
tumors [37,38]. In view of the findings that abnormal 
centrosomes retain the ability to nucleate microtubules in 
vitro, it is conceivable that cells with abnormal centro- 
somes may missegregate chromosomes producing aneu- 
ploid cells. The molecular and genetic bases of abnormal 
centrosome generation and the precise pathway through 
which they regulate the chromosome segregation process 
remain to be elucidated. Recent discovery of a centro- 
some-associated kinase STK15/BTAK/aurora2, naturally 
amplified and overexpressed in human cancers, has raised 
the interesting possibility that aberrant expression of this 
kinase is critically involved in abnormal centrosome func- 
tion and unequal chromosome segregation in tumor cells 
[39,40]. Exogenous expression of the kinase in rodent and 
human cells was found to correlate with an abnormal 
number of centrosomes, unequal partitioning of chromo- 
somes during division, and tumorigenic transformation of 
cells. It is relevant in this context to mention that the 
Xenopus homologue of human STK15/BTAK/aurora2 
kinase has recently been shown to phosphorylate a micro- 
tubule motor protein XlEg5, the human orthologue of 
which is known to participate in the centrosome separa- 
tion during mitosis [41]. Findings on STK15/aurora2 
kinase, thus, provide an interesting lead to a possible 
molecular mechanism of the centrosome's role in oncogene- 
sis. Centrosomes have, of late, been implicated in onco- 
genesis from studies revealing supernumerary centro- 
somes in p53-deficient fibroblasts and overexpression of 
another centrosome kinase PLK1 being detected in 
human non-small cell lung cancer [42]. 

One of the critical events that ensures equal partition- 
ing of the chromosomes during mitosis is the proper 
and timely separation of sister chromatids that are 
attached to each other and to the mitotic spindle. 
Untimely separation of sister chromatids has been 
suspected as a cause of aneuploidy in human tumors. 
Cohesion between sister chromatids is established 
during replication of chromosomes and is retained until 
the next metaphase/anaphase transition. It has been 
shown that during metaphase-anaphase transition, the 
anaphase promoting complex/cyclosome triggers the 
degradation of a group of proteins called securins that 
inhibit sister chromatid separation. A vertebrate securin 
(v-securin) has recently been identified that inhibits 
sister chromatid separation and is involved in transfor- 
mation and tumorigenesis. Subsequent analysis 
revealed that the human securin is identical to the 
product of the gene called pituitary tumor transforming 
gene, which is overexpressed in some tumors and 
exhibits transforming activity in NIH3T3 cells. It is 
proposed that elevated expression of the v-securin may 
contribute to generation of malignant tumors due to 


chromosome gain or loss produced by errors in chro- 
matid separation [43•]. 

Normal progression through mitosis during prophase to 
anaphase transition is monitored at least at two check- 
points: one checkpoint operates during early prophase 
at G2 to metaphase progression while the second 
ensures proper segregation of chromosomes during 
metaphase to anaphase transition. Several mitotic 
checkpoint genes responding to mitotic spindle defects 
have been identified in yeast. The metaphase-anaphase 
transition is delayed following activation of this check- 
point during which kinetochores remain unattached to 
the spindle. The signal is transmitted through a kineto- 
chore protein complex consisting of Mps1p and several 
Mad and Bub proteins [44]. It is expected that for 
unequal chromosome segregation to be perpetuated 
through cell proliferation cycles, giving rise to aneu- 
ploidy, checkpoint controls have to be abrogated. 

Following this logic, Vogelstein et al. [45•] hypothesized 
that aneuploid tumors would reveal mutation in mitotic 
spindle checkpoint genes. Subsequent studies by these 
investigators have proven the validity of this hypothesis 
and a small fraction of human colorectal cancers have 
revealed the presence of mutations in either hBub1 or 
hBubR1 checkpoint genes. It was further revealed that 
mutant BUB1 could function in a dominant negative 
manner conferring an abnormal spindle checkpoint 
when expressed exogenously. Inactivation of spindle 
checkpoint function in virally induced leukemia has also 
recently been documented following the finding that 
hMAD1 checkpoint protein is targeted by the Tax 
protein of the human T-cell leukemia virus type 1. 
Abrogation of hMAD1 function leads to multinucleation 
and aneuploidy [46]. 

In addition to mitotic spindle checkpoint defects, failed 
DNA damage checkpoint function in yeast is frequently 
associated with aberrant chromosome segregation as 
well. It, therefore, appears intriguing yet relevant that 
the human BRCA1 gene, proposed to be involved in 
DNA damage checkpoint function, when mutated by a 
targeted deletion of exon 11 led to defective G2/M cell 
cycle checkpoint function and genetic instability in 
mouse embryonic fibroblasts [47]. The cells revealed 
multiple functional centrosomes and unequal chromo- 
some segregation and aneuploidy. Although the molecu- 
lar basis for these abnormalities is not known at this 
time, it raises the interesting possibility that such an 
aneuploidy-driven mechanism may be involved in 
tumorigenesis in individuals carrying germline muta- 
tions of the BRCA1 gene. 

Conclusion 

Growing evidence from human tumor cytogenetic inves- 
tigations strongly suggests that aneuploidy is associated 
with the development of tumor phenotypes. Clinical 
findings of correlation between aneuploidy and tumori- 
genesis are supported by studies with in vitro grown 
transformed cell lines. Molecular genetic analyses of 
tumor cells provide credible evidence that mutations in 
genes controlling chromosome segregation during 
mitosis play a critical role in causing chromosome insta- 
bility leading to aneuploidy in cancer. Further elucida- 
tion of molecular and physiologic bases of chromosome 
instability and aneuploidy induction could lead to the 
development of new therapeutic approaches for 
common forms of cancer. 

Describes oncogenic property of centrosome-associated STK15/aurora2 kinase 
and its involvement in aneuploidy induction. 

40 • Bischoff JR, Anderson L, Zhu Y, Mossie K, Ng L, Chan CS, et al.: A homo- 
logue of Drosophila aurora kinase is oncogenic and amplified in human 
colorectal cancers. EMBO J 1998, 17:3052-3065. 

Describes oncogenic property of STK15/aurora2 kinase and involvement in 
colorectal cancers. 

Figure 304 shows the amino acid sequence (SEQ ID NO:422) derived from the coding sequence of 
SEQ ID NO:421 shown in Figure 303. 

Figure 305 shows a nucleotide sequence (SEQ ID NO:423) of a native sequence PROB84 (UNQ721) 
cDNA, wherein SEQ ID NO:423 is a clone designated herein as "DNA71159-1617". 

Figure 306 shows the amino acid sequence (SEQ ID NO:424) derived from the coding sequence of 
SEQ ID NO:423 shown in Figure 305. 

Figure 307 shows a nucleotide sequence (SEQ ID NO:494) of a native sequence PRO183 cDNA, 
wherein SEQ ID NO:494 is a clone designated herein as "DNA28498". 

Figure 308 shows the amino acid sequence (SEQ ID NO:495) derived from the coding sequence of 
SEQ ID NO:494 shown in Figure 307. 

Figure 309 shows a nucleotide sequence (SEQ ID NO:496) of a native sequence PRO184 cDNA, 
wherein SEQ ID NO:496 is a clone designated herein as "DNA28500". 

Figure 310 shows the amino acid sequence (SEQ ID NO:497) derived from the coding sequence of 
SEQ ID NO:496 shown in Figure 309. 

Figure 311 shows a nucleotide sequence (SEQ ID NO:498) of a native sequence PRO185 cDNA, 
wherein SEQ ID NO:498 is a clone designated herein as "DNA28503". 

Figure 312 shows the amino acid sequence (SEQ ID NO:499) derived from the coding sequence of 
SEQ ID NO:498 shown in Figure 311. 

Figure 313 shows a nucleotide sequence (SEQ ID NO:500) of a native sequence PRO331 cDNA, 
wherein SEQ ID NO:500 is a clone designated herein as "DNA40981-1234". 

Figure 314 shows the amino acid sequence (SEQ ID NO:501) derived from the coding sequence of 
SEQ ID NO:500 shown in Figure 313. 

Figure 315 shows a nucleotide sequence (SEQ ID NO:502) of a native sequence PRO363 cDNA, 
wherein SEQ ID NO:502 is a clone designated herein as "DNA45419-1252". 

Figure 316 shows the amino acid sequence (SEQ ID NO:503) derived from the coding sequence of 
SEQ ID NO:502 shown in Figure 315. 

Figure 317 shows a nucleotide sequence (SEQ ID NO:504) of a native sequence PRO5723 cDNA, 
wherein SEQ ID NO:504 is a clone designated herein as "DNA82361". 

Figure 318 shows the amino acid sequence (SEQ ID NO:505) derived from the coding sequence of 
SEQ ID NO:504 shown in Figure 317. 

Figure 319 shows a nucleotide sequence (SEQ ID NO:506) of a native sequence PRO3301 cDNA, 
wherein SEQ ID NO:506 is a clone designated herein as "DNA88002". 

Figure 320 shows the amino acid sequence (SEQ ID NO:507) derived from the coding sequence of 
SEQ ID NO:506 shown in Figure 319. 

Figure 321 shows a nucleotide sequence (SEQ ID NO:508) of a native sequence PRO9940 cDNA, 
wherein SEQ ID NO:508 is a clone designated herein as "DNA92282". 

Figure 322 shows the amino acid sequence (SEQ ID NO:509) derived from the coding sequence of 
SEQ ID NO:508 shown in Figure 321. 

Figure 323 shows a nucleotide sequence (SEQ ID NO:510) of a native sequence PRO9828 cDNA, 
wherein SEQ ID NO:510 is a clone designated herein as "DNA142238-2768". 

Figure 324 shows the amino acid sequence (SEQ ID NO:511) derived from the coding sequence of 
SEQ ID NO:510 shown in Figure 323. 

Figure 325 shows a nucleotide sequence (SEQ ID NO:512) of a native sequence PRO7170 cDNA, 
wherein SEQ ID NO:512 is a clone designated herein as "DNA108722-2743". 

Figure 326 shows the amino acid sequence (SEQ ID NO:513) derived from the coding sequence of 
SEQ ID NO:512 shown in Figure 325. 

Figure 327 shows a nucleotide sequence (SEQ ID NO:514) of a native sequence PRO361 cDNA, 
wherein SEQ ID NO:514 is a clone designated herein as "DNA45410-1250". 

Figure 328 shows the amino acid sequence (SEQ ID NO:515) derived from the coding sequence of 
SEQ ID NO:514 shown in Figure 327. 

Figure 329 shows a nucleotide sequence (SEQ ID NO:516) of a native sequence PRO846 cDNA, 
wherein SEQ ID NO:516 is a clone designated herein as "DNA44196-1353". 

Figure 330 shows the amino acid sequence (SEQ ID NO:517) derived from the coding sequence of 
SEQ ID NO:516 shown in Figure 329. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
I. Definitions 

The terms "PRO polypeptide" and "PRO" as used herein and when immediately followed by a 
numerical designation refer to various polypeptides, wherein the complete designation (i.e., PRO/number) 
refers to specific polypeptide sequences as described herein. The terms "PRO/number polypeptide" and 
"PRO/number" wherein the term "number" is provided as an actual numerical designation as used herein 
encompass native sequence polypeptides and polypeptide variants (which are further defined herein). The PRO 
polypeptides described herein may be isolated from a variety of sources, such as from human tissue types or 
from another source, or prepared by recombinant or synthetic methods. 

A "native sequence PRO polypeptide" comprises a polypeptide having the same amino acid sequence 
as the corresponding PRO polypeptide derived from nature. Such native sequence PRO polypeptides can be 
isolated from nature or can be produced by recombinant or synthetic means. The term "native sequence PRO 
polypeptide" specifically encompasses naturally-occurring truncated or secreted forms of the specific PRO 
polypeptide (e.g., an extracellular domain sequence), naturally-occurring variant forms (e.g., alternatively 
spliced forms) and naturally-occurring allelic variants of the polypeptide. In various embodiments of the 
invention, the native sequence PRO polypeptides disclosed herein are mature or full-length native sequence 
polypeptides comprising the full-length amino acid sequences shown in the accompanying figures. Start and 
stop codons are shown in bold font and underlined in the figures. However, while the PRO polypeptides 
disclosed in the accompanying figures are shown to begin with methionine residues designated herein as amino 
acid position 1 in the figures, it is conceivable and possible that other methionine residues located either 
upstream or downstream from the amino acid position 1 in the figures may be employed as the starting amino 
acid residue for the PRO polypeptides. 

The PRO polypeptide "extracellular domain" or "ECD" refers to a form of the PRO polypeptide 
which is essentially free of the transmembrane and cytoplasmic domains. Ordinarily, a PRO polypeptide ECD 
will have less than 1% of such transmembrane and/or cytoplasmic domains and preferably, will have less than 
0.5% of such domains. It will be understood that any transmembrane domains identified for the PRO 
polypeptides of the present invention are identified pursuant to criteria routinely employed in the art for 
identifying that type of hydrophobic domain. The exact boundaries of a transmembrane domain may vary but 
most likely by no more than about 5 amino acids at either end of the domain as initially identified herein. 
Optionally, therefore, an extracellular domain of a PRO polypeptide may contain from about 5 or fewer amino 
acids on either side of the transmembrane domain/extracellular domain boundary as identified in the Examples 
or specification and such polypeptides, with or without the associated signal peptide, and nucleic acid encoding 
them, are contemplated by the present invention. 

The approximate locations of the "signal peptides" of the various PRO polypeptides disclosed herein 
are shown in the present specification and/or the accompanying figures. It is noted, however, that the C- 
terminal boundary of a signal peptide may vary, but most likely by no more than about 5 amino acids on either 
side of the signal peptide C-terminal boundary as initially identified herein, wherein the C-terminal boundary 
of the signal peptide may be identified pursuant to criteria routinely employed in the art for identifying that type 
of amino acid sequence element (e.g., Nielsen et al., Prot. Eng. 10:1-6 (1997) and von Heijne et al., Nucl. 
Acids Res. 14:4683-4690 (1986)). Moreover, it is also recognized that, in some cases, cleavage of a signal 
sequence from a secreted polypeptide is not entirely uniform, resulting in more than one secreted species. 
These mature polypeptides, where the signal peptide is cleaved within no more than about 5 amino acids on 
either side of the C-terminal boundary of the signal peptide as identified herein, and the polynucleotides 
encoding them, are contemplated by the present invention. 

"PRO polypeptide variant" means an active PRO polypeptide as defined above or below having at least 
about 80% amino acid sequence identity with a full-length native sequence PRO polypeptide sequence as 
disclosed herein, a PRO polypeptide sequence lacking the signal peptide as disclosed herein, an extracellular 
domain of a PRO polypeptide, with or without the signal peptide, as disclosed herein or any other fragment 
of a full-length PRO polypeptide sequence as disclosed herein. Such PRO polypeptide variants include, for 
instance, PRO polypeptides wherein one or more amino acid residues are added, or deleted, at the N- or C- 
terminus of the full-length native amino acid sequence. Ordinarily, a PRO polypeptide variant will have at 
least about 80% amino acid sequence identity, preferably at least about 81% amino acid sequence identity, 
more preferably at least about 82% amino acid sequence identity, more preferably at least about 83% amino 
acid sequence identity, more preferably at least about 84% amino acid sequence identity, more preferably at 
least about 85% amino acid sequence identity, more preferably at least about 86% amino acid sequence 
identity, more preferably at least about 87% amino acid sequence identity, more preferably at least about 88% 
amino acid sequence identity, more preferably at least about 89% amino acid sequence identity, more 
preferably at least about 90% amino acid sequence identity, more preferably at least about 91% amino acid 
sequence identity, more preferably at least about 92% amino acid sequence identity, more preferably at least 
about 93% amino acid sequence identity, more preferably at least about 94% amino acid sequence identity, 
more preferably at least about 95% amino acid sequence identity, more preferably at least about 96% amino 
acid sequence identity, more preferably at least about 97% amino acid sequence identity, more preferably at 
least about 98% amino acid sequence identity and most preferably at least about 99% amino acid sequence 
identity with a full-length native sequence PRO polypeptide sequence as disclosed herein, a PRO polypeptide 
sequence lacking the signal peptide as disclosed herein, an extracellular domain of a PRO polypeptide, with 
or without the signal peptide, as disclosed herein or any other specifically defined fragment of a full-length 
PRO polypeptide sequence as disclosed herein. Ordinarily, PRO variant polypeptides are at least about 10 
amino acids in length, often at least about 20 amino acids in length, more often at least about 30 amino acids 
in length, more often at least about 40 amino acids in length, more often at least about 50 amino acids in 
length, more often at least about 60 amino acids in length, more often at least about 70 amino acids in length, 
more often at least about 80 amino acids in length, more often at least about 90 amino acids in length, more 
often at least about 100 amino acids in length, more often at least about 150 amino acids in length, more often 
at least about 200 amino acids in length, more often at least about 300 amino acids in length, or more. 

"Percent (%) amino acid sequence identity" with respect to the PRO polypeptide sequences identified 
herein is defined as the percentage of amino acid residues in a candidate sequence that are identical with the 
amino acid residues in the specific PRO polypeptide sequence, after aligning the sequences and introducing 
gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative 
substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid 
sequence identity can be achieved in various ways that are within the skill in the art, for instance, using 
publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. 
Those skilled in the art can determine appropriate parameters for measuring alignment, including any 
algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For 
purposes herein, however, % amino acid sequence identity values are generated using the sequence comparison 
computer program ALIGN-2, wherein the complete source code for the ALIGN-2 program is provided in 
Table 1 below. The ALIGN-2 sequence comparison computer program was authored by Genentech, Inc. and 
the source code shown in Table 1 below has been filed with user documentation in the U.S. Copyright Office, 
Washington D.C., 20559, where it is registered under U.S. Copyright Registration No. TXU510087. The 
ALIGN-2 program is publicly available through Genentech, Inc., South San Francisco, California or may be 
compiled from the source code provided in Table 1 below. The ALIGN-2 program should be compiled for 
use on a UNIX operating system, preferably digital UNIX V4.0D. All sequence comparison parameters are 
set by the ALIGN-2 program and do not vary. 
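
The definition of percent identity above can be illustrated with a minimal sketch. This is not ALIGN-2 and does not reproduce its alignment or scoring: it assumes the two sequences have already been aligned by some other tool and are supplied as equal-length strings with `-` marking gaps. The function name `percent_identity`, the example sequences, and the choice of denominator (the number of residues in the reference sequence) are assumptions made for illustration.

```python
def percent_identity(aligned_candidate, aligned_reference, gap="-"):
    """Percent identity computed from two already-aligned, equal-length sequences."""
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    # Identical matches: aligned positions where both sequences carry the same
    # residue. Gaps never count, and conservative substitutions are not
    # credited as matches.
    matches = sum(
        1
        for a, b in zip(aligned_candidate, aligned_reference)
        if a == b and a != gap
    )
    # Denominator (assumption): residues in the reference sequence, gaps excluded.
    reference_length = sum(1 for b in aligned_reference if b != gap)
    if reference_length == 0:
        raise ValueError("reference sequence has no residues")
    return 100.0 * matches / reference_length


# Hypothetical aligned peptide fragments (illustrative values only).
identity = percent_identity("MKT-LLV", "MKTALLI")
print(f"{identity:.1f}% identity")
```

In this toy pair, five of the seven reference residues are matched by identical residues, so a candidate like this would fall below the 80% threshold used in the variant definition.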

In situations where ALIGN-2 is employed for amino acid sequence comparisons, the % amino acid 
sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which 
can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid 
sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: 



100 times the fraction X/Y 